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Moat  of  tire  designs,  analyses,  and  optimisation  techniques  available  In  response 
surface  nietiiodology  concern  quantitative  responses  where  the  usual  assumptions  are 
normal  errors  with  erjual  variances.  Furtlierniore,  Che  fitted  models  are  assumed  to 
be  linear.  In  many  experimental  situations,  the  data  do  not  follow  this  pattern  of 
assumptions.  In  tliis  case,  tlie  use  of  trnditioiiai  response  surface  techniques  would  be 
inappropriate.  On  the  other  hand,  generalized  linear  models  (GLM)  can  accomodate 
a wider  variety  of  responses  (both  qualitative  and  quantitative)  while  requiring  less 


In  this  dissertation,  I extend  the  standard  GLM  procedure  for  estimating  parame- 
ters (i.e.  the  one  based  on  first-order  approximation)  to  tlie  one  wliere  parameters  in 
a model  arc  astimsted  using  a second-order  Taylor  series  approximation.  The  purpose 


of  this  extension  is  to  reduce  tlie  magnitude  of  the  prediction  varisnce  associated  with 
the  mean  response.  Two  different  methods  for  obtaining  more  precise  estimates  of 
the  parameters  are  deveioped  and  roinpared  with  the  standard  method. 

Next,  we  add  to  the  receni  progress  that  has  been  made  in  the  construction  of 
lack  of  fit  tests  for  the  lo^stlr  model.  An  approximate  lack  of  fit  test  statistic  is 
introduited  for  use  in  situations  involving  normal  errors  witli  unequal  variances.  The 
robustiress  of  tlie  test  statistic  is  also  explored  under  different  designs. 

Optimization  of  the  response  is  an  important  feature  of  response  surface  method- 
ology. A common  melliod  used  for  maximizing  (or  minimizing)  a response  is  ridge 
analysis.  More  recently,  a modification  of  ridge  analysis  iias  been  developed  and  is 
called  modified  ridge  analysis.  In  this  dissertation,  methods  are  derived  for  response 
optimization  using  generalized  linear  models. 

Optimal  designs  Imve  been  developed  in  tlie  case  of  a binary  response  for  the  one- 
variable  case..  In  this  work,  several  criteria  are  used  for  generating  optimal  designs 
under  any  generalized  linear  model.  Since  the  design  depends  on  the  parameters 
in  the  model  in  this  situation,  both  saiiueniial  and  Bayesian  sequential  methods  are 
utilized  to  obtain  optimal  design. 


CHAPTER  1 


INTRODUCTION 


Response  Surface  Methodology  <‘onsisCs  of  a sol.  of  techniques  that  includes 

(i)  designing  a series  of  experiments  that  will  result  in  adequate  and  reliable  measure* 
meiit  of  the  respoitse, 

(ii)  obtaining  a model  that  best  liLs  the  data  obtained  in  (i),  and 

(iii)  determining  the  levels  of  the  experitnental  factors  that  optimize  (maximize  or 
miuimize)  the  predicted  response. 

A general  linear  model  in  a response  surface  situation  is  given  by 

V = X!3  + (.  (l.U) 

where  AT  is  an  n x p matrix  of  design  settings  and  functions  of  these  settiup,  ^ is  a 
p X 1 vector  of  unknown  parameters,  y denotes  the  vector  of  observed  responses,  and 
e is  a randoiii  error  vector. 

Most  of  the  design,  analyses,  and  uptiinization  techniques  available  io  response 
surfaco  methodology  assume  continuous  quautitative  responses  where  the  usual  as- 
sumptions are  iionnai  errors  possessing  equal  error  variances.  Furthermore,  tlie  fitted 
models  are  assumed  to  be  linear  in  the  parameters.  In  many  experbnental  situations, 
however,  the  data  do  not  follow  this  pattern  of  assumptions.  For  example,  in  some 
biometric  applications,  the  response  may  be  binary  or  it  may  represent  count  data 
such  as  the  number  of  deaths  due  to  a particular  disease.  In  this  case,  the  use  of 


traditional  response  surface  technii]ups  wniild  he  inappropriate.  On  the  other  hand, 
generalized  linear  morlela  (GtM)  can  accomodate  a wider  variety  of  response  types 
(hath  qualitative  and  quantitative)  while  requiring  less  rigorous  aasiiroptions. 

Mead  and  Pike  (1975)  reviewed  the  developments  in  response  surface  methodol- 
ogy pertaining  to  applied  re.search  with  einpliasis  on  biometric  applicatioru;.  Box  and 
Wilson  (1951)  considered  the  use  of  polynomial  models  to  approximate  the  mean  re- 
spoirse.  They  were  the  hrst  ones  to  introduce  the  concept  of ‘composite*  designs.  Box 
and  Lucas  (1959)  were  tire  first  to  consider  fitting  a general  nonlinear  model  involving 
k variables  and  p parameters.  Box  and  Cox  (1964)  discussed  transrormation  of  data 
values  using  a puwer  family  transformation  of  tlie  response.  Generally,  a transforma- 
tion which  yields  additive  effects  in  tlie  rrystematic  component  of  the  model  as  well 
as  constancy  of  variances  of  the  random  error  is  desired.  Nelder  (1966)  introduced 
inverse  polynomials  as  response  functions.  In  196S,  Atkinson  and  Hunter  developed 
experimental  designs  for  parameter  estimation  in  nonlinear  models  and.  Box  (1971) 
considered  tire  amount  of  bias  in  the  least  square  estimates  of  the  parameters  in  a 
non-linear  model. 

Nelder  and  Wedderbum  (1972)  were  the  first  to  introduce  the  concept  of  general- 
ized linear  models.  A generalized  linear  model  is  specified  by: 

(i)  independent  observations  Vi,..  .,y„  liaving  a certain  probability  distribution  J{y,9) 
that  belongs  to  the  exponential  fatnily  with  parameters  6. 

(ii)  a set  of  explanatory  variables,  for  eadi  observation  describing  the 

systematic  component 

where  ^ is  an  unknown  parameter  vector  and  <t>  is  linear  in  the  elements  of  0 and  can 
be  expressed  as  f'{xi)0,  where  Zi  = (i,i,.-.,z,t)'  and  f'{zi)  is  a vector  of  powers 
and  cross  products  of  powers  of  the  ar,j's,  i = l,...,n,  y = 1,. t.  The  component 
T\  U referred  to  as  a linear  predictor. 


(iiij  a link  function  g(fi,)  relating  to  yt,,  the  I'th  mean  response,  that  is  >),  s giin), 
where  g(  ) is  a strictly  monotone  and  differentiable  function. 

The  most  commonly  used  link  ruuctions  are  : 

(i)  probit,  ry,  = ♦“'(yr,)  where  r^(-)  U the  rumnlative  distribution  function  of  the 
standard  normal  variate 

(11)  log  function,  ry,  = log(yi,),  corresponding  to  a luglinear  model 
(Hi)  logit,  ry,  = log(-j^). 

The  usual  approach,  when  the  variances  of  the  response  vari^le  are  not  homoge- 
nous, is  to  Rod  a suit^lc  transfonnacion  of  Che  response  such  Chat  the  assumption 
of  equal  variances  holds  for  the  transformed  variate.  An  ideal  transfomiation  is  one 
that  aatisRes  the  following  two  requirements: 

(1)  the  variance  of  the  transformed  variate  is  unaffected  by  dianges  in  mean  response. 

(2)  The  effects  of  the  explanatory  variablts  on  the  response  under  the  transformed 
scale  are  linear  and  additive  (no  interaetlnn). 

Violations  of  either  of  the  assumptions  inRuence  the  probability  levels  of  the  tests  of 
.signiRcaiice.  Variances  of  the  cransformetl  variate  should  be  unaffected  by  clianges  in 
the  mean  to  hold  the  assumption  of  equal  varainces  for  performing  the  usual  F-tests, 
since  tests  for  equality  of  means,  assuming  con.uont  varhtnera,  will  be  insensitive. 
With  nonadditivity,  there  is  an  interaction  term  in  the  model.  Sinre  both  Che  inter- 
action and  the  error  vary  together,  it  would  be  impossible  to  separate  the  two  effects, 
or  in  other  words,  the  effects  will  he  confounded.  Therefore  the  expected  value  of  the 
residual  sum  of  squares  will  be 

error  variance -s  (interaction  effect)’ 

so  that  no  unbiased  estimate  of  the  experimental  error  variance  exists. 

A single  transformatiun  may  not  satisfy  both  requirements.  Nelder  and  Wedder- 
burn  (1972)  showed  Chat  a square  root  transformation  of  the  response  was  needed 


for  the  homogfnfity  of  variances  property  to  hold  for  the  Tuberculin  data  of  Fisher 
(1949),  while  the  additivity  of  tlie  model  is  produced  by  a log  transformation  of  the  re- 
sponse. An  advantage  of  the  generalized  linear  model  approach  is  that  the  additivity 
property  can  be  achieved  using  the  link  function  independently  of  any  transformation 
of  the  data  values  that  can  result  In  constant  variances.  For  example,  in  tlie  Fislier 
data,  the  analysis  using  log-link  tu  with  Var{y,)  ce  p,  produced  the  same  result  as 
the  analysis  of  with  Var(yj^)  = constant,  using  log-link  (McCullagfa  and  Nelder, 
1989,  p.  378). 

Within  the  framework  of  generalized  linear  models,  the  fitting  of  the  hypothesized 
model  involves  the  following  : 

(1)  specifying  the  correct  error  distribution, 

(ii)  determining  tlie  relevant  systematic  compoiient  for  the  linear  predictor  using  some 
lack  of  Ht  test,  and 

(ifi)  defining  the  proper  link  function  s{-)- 

Khuri  (199i)  proposed  the  use  of  response  surface  methodology  under  generalized 
linear  models  for  determination  of  opiiuium  conditions  of  the  explanatory  variables 
and  the  development  of  optimal  designs. 

In  the  so-called  standard  generalized  linear  model  fitting  procedure,  a first-order 
approximation  of  y{yi),i  = l,...n,  is  used.  An  extension  of  this  standard  method 
using  a second-order  approximation  is  proposed  in  Chapter  Two.  An  advantage  of 
tills  extension  is  a reduction  in  both  the  standard  errors  of  tlie  parameter  estimates 
as  well  as  the  prediction  variance. 

Prepbon  (1980)  was  the  first  to  discuss  the  adequacy  of  the  hypotliesized  link 
function  used  in  fitting  a generalized  linear  model.  He  presented  tests  assuming  chat 
the  error  distribution  is  correct  and  the  variables  ini  luded  in  the  systematic  compo- 
nent are  determined  properly.  The  test  for  tlie  goodness  of  fit  for  the  hypotheiized 
link  wa.s  based  on  tlie  deviance  statistic.  The  deviance  statistic  is  defined  as  twice  the 


difference  between  die  maximum  achievable  log  likelihood  and  the  maximum  of  the 
log  likelihood  under  the  assumption  that  a given  model  holds.  Prepbon’s  procedure 
is  useful  only  under  certain  essuuiptloiis,  for  example,  the  true  unknown  link  func- 
tion has  to  be  a member  of  the  family  of  link  functions  under  cousideration.  Later 
Taiatis  (1981)  formulated  a goodness  of  fit  procedure  by  partitioning  the  covariate 
data  points  into  disjoint  subsets.  He  used  a logistic  regression  model  for  the  test.  The 
drawback  of  this  method  is  that  different  partitions  may  lead  to  different  conclusions. 
Su  and  Wei  (1991)  suggested  a lack  of  fit  test  based  ou  the  supremum  of  the  partial 
sums  of  residuals.  The  mostly  used  gondiiess  of  fit  test  of  logistic  regression  is  the 
Hnsmer-Lerncahow  teat  (Hosmer  and  Lemesliow,  1980). 

Previously,  attention  was  given  to  testing  lack  of  fit  iu  a response  surface  situation 
under  the  assumption  that  the  error  variances  are  equal.  The  classical  lack  of  fit  test  is 
based  on  an  exact  F-siatistic  which  requires  the  availability  of  replicated  observations 
at  some  points  in  the  experimental  region  (Khuri  and  Cornell,  1996).  Neill  and 
Johnson  (1985)  proposed  an  approximate  lack  of  fit  test  based  on  near  replicates. 
Tlieir  test  was  modified  by  Christensen  (1989)  to  obtain  an  exact  F-iest  based  on 
near  or  exact  replicates.  The  same  cannot  be  said  about  the  development  of  lack  of 
lit  tests  when  the  error  variances  are  not  equal.  In  Chapter  Three,  an  approximate 
lack  of  fit  test  in  case  of  heterngenous  error  variances  is  proposed. 

Chapter  Four  of  this  dissertation  deals  with  response  optimiaation.  For  a limt- 
order  model,  tlie  method  uf  steepest  ascent  (decent)  can  be  used  to  seek  a region  that 
contains  the  settings  of  the  explanatory  variables  that  produce  the  optimum  response 
(Box  and  Wilson,  1951).  The  dlmtlon  of  steepest  ascent  (decent)  depends  on  the 
sealed  units  of  the  model’s  input  (control)  variables.  A derivative-free  ojilimisation 
metliod  was  generalieed  by  Olsson  and  Nelson  (1975)  who  used  the  Nelder-Mead 
simplex  inetliod  for  function  minimiaation.  Carter,  Wampler,  Crews  and  Howells 


6 

(1977)  used  the  simplex  method  for  the  determination  of  the  treatment  levels  which 
optimize  the  probability  of  a favorable  response  in  a lo^stir.  regression  situation. 

A popular  method  for  response  optimisation  is  ridge  analysis  (Draper,  1963).  In 
the  standard  procedure  of  ridge  analysis,  however,  the  design  used  may  result  in 
large  variations  in  the  valuts  of  the  prediction  variance  on  a given  sphere.  Khuri  and 
Myers  (1979)  modiKed  this  procedure  by  imposing  certain  constraints  on  the  size  of 
the  prediction  variance.  Tliis  modihcatiun  is  extended  to  situatious  involving  the  use 
of  linear  models  with  heterogenous  error  variances  as  well  as  the  more  general  case  cf 
generalized  linear  models. 

There  arc  two  scliools  of  thought  for  the  development  of  optimal  desigim  in  re- 
sponse surface  methodology.  Let  i;(x)  denote  the  predicted  response  at  a point  x in 
a region  of  interest.  Kiefer  (1958,1959,1 9fiO,19G1 ,1962a, 1962b)  concentrated  munly 
on  developing  optimal  designs  based  on  minimizing  the  prediction  variance.  On  the 
other  hand,  Box  and  Draper  (1959,1982)  gave  more  attention  to  bias  iu  y(x).  Optimal 
designs  were  developed  based  on  minimizing  the  bias  suspected  to  be  present  in  tbe 
fitted  model.  Box  and  Lucas  (1959)  were  the  first  to  derive  optimal  designs  in  the  case 
of  non-linear  models.  They  proposed  a design  which  minimized  the  determinant  of 
the  approximate  variance-covariance  matrix  of  the  vector  of  least-squares  parameter 
estimates.  Tire  problem  with  this  design  .strategy  is  its  dependenry  on  the  unknown 
parameter  values.  A modilicatioii  was  proposed  by  Box  and  Hunter  (1965)  where 
design  points  were  sequentially  added  one  at  a time.  At  eacl)  stage,  new  parameter 
estimates  are  obtained  until  a stability  in  tiie  values  of  the  estimates  is  achieved. 

Abdelbasit  and  Plackett  (1983)  discussed  the  robustness  of  the  methods  to  find 
optimal  designs  using  initial  estimates  as  well  as  sequeutial  methods  in  the  case  of 
binary  data.  ALso,  in  a logistic  model  situation,  Wii  (1985)  provided  sequential  de- 
signs for  estimating  the  percentiles  of  the  response.  Qialonerond  Lamtz  (1989)  used 


Bnycsian  technique!:  Co  derlt'e  optimal  deeigiiK  Fur  lugisUt:  regression  in  first-order  mod- 
els. They  used  two  dilfereiii  criteria,  the  first  based  on  maximizing  the  determinant 
of  Che  moment  matrix  and  the  other  based  on  minimizing  the  prediction  variance. 

In  Chapter  Five,  two  methods  For  finding  optitnal  designs  For  generalized  linear 
models  are  presented.  Tlie  first  method  uses  set)uential  techniques  while  Che  sec- 
ond iitiiizes  Bayesian  methodology.  Tlie  two  methods  are  to  be  used  For  fitting  first 
and  second-order  response  surFace  models.  Four  diFTerent  optimality  criteria  are  pro- 
posed: D-optiinality  and  A-optiniality  criteria  for  generalized  linear  models  (Chaloner 
and  Lamtz,  1989),  Q-optimality  (Myers,  Myers  and  Carter,  1994)  and  Aj-optimnlity 
(Jones  and  Mitchell,  1978). 

Atkinson  and  Federov  (1975)  introduced  sequential  optimal  designs  in  the  case  oF 
normal  errors  with  constant  variances  for  discriminating  betvnwn  two  models.  To  de- 
tect lack  of  fit,  Jones  and  Mitchell  (1978)  used  the  A}-optimality  criterion  introduced 
by  Atkinson  and  Federov  (19'^).  Wijesinha  and  Kliuri  (1987)  extended  this  idea  to 
a mulciresponse  situation  to  obtain  optimal  designs  sequentially.  Also,  some  work 
was  done  using  Bayesian  inetliuds  in  this  area.  FcIsGiistcin  (1992)  derived  optinml 
Bayesian  designs  to  discriminate  between  two  models.  In  Chapter  Five,  sequential 
and  Bayesian  procedures  for  detecting  lack  of  Fit  of  the  assumed  model  are  considered 
using  the  criteria  by  Jones  and  Mitchell  (1978). 

In  summary,  the  following  topics  are  addressed  in  this  dlssertacion: 

(1)  Chapter  Two  focusses  on  extending  the  standard  GLM  fitting  procedure  using 
second-order  approximation  of  a function  of  the  response. 

(2)  A lack  of  fit  test  (for  heterogenous  variantea)  and  its  robustness  are  discussed  in 
Chapter  Three. 

(3)  Determination  of  the  optimum  response  under  GLM,  using  a modification  of  the 
modified  ridge  analysis,  is  introduced  in  Chapter  Four,  and  finally, 

(4)  Chapter  Five  gives  several  optimal  designs  for  geiieralizixi  linear  models. 


CHAPTER  2 


A MODIFICATION  OF  TIIF,  STANDARD  GLM  PARAMETER 
ESTIMATION  PROCEDURE 


2.1  iMt.roHLr.r.iiiii 

Suppost*  Vi.  • • - 1 Uf.  iiidpppmlenily  ilisiribuieil  airanHiig  to  ru  expoiiRntiftl  fam- 
ily distribution  with  E{y,)  = (i,,  I'nr(y,)  = it®,  i = 1 n.  Then,  V'nr(i/)  s=U  ^ 

diag(ff?,.  ...IT®)  where  y = (pi a„)'. 

A generalized  linear  model  is  specified  by 

’!  = «'(zi xt.p]  (2.1,1) 

where  0 is  an  unknown  parameter  wior  and  4 is  iinear  in  tiir  eieinenl*  of  0 and  can 

be  expressed  as  f'{x)0,  wliere  x = (j, tj,)'  and  /'(*)  i-s  a vector  of  powers  and 

cross  prodiirtaof  powers  of  thei,'s  j = ],..  .,k  (wiiere  the  jr,’s  are  coded  valuM  of  it 
input  factors).  Tire  component  i?  is  referreil  to  as  a iinear  predictor.  Tire  systematic 
coniponejit  r;,  = f'{xi)0  i*  related  to  tlie  mean  of  the  i-lh  response  through  tile 
link  function  rj,  = j(p,)  wliere  xj  = (z,i,.  ...ij,)'  represents  the  design  setting  at 
the  1-tb  point  (i  = 1, . . . , n).  In  the  ataiidarri  GLM  fitting  procedure,  a first-order 
npprnximation  of  9(1/,)  is  riefiiierl  as  follows: 
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In  this  cliftpter,  a second-ordpr  approximation  of  j7(v,}  using  two  different  methods 
will  be  consirirred.  Tlie  two  methods  will  be  coiiLpared  with  regard  to  efficiency  of 
the  estimates  of  the  unknown  parameters.  Section  3.2  ^vr«  the  expressions  for  all  the 
estimates  and  their  variances  for  the  standard  method  while  in  Sections  2.3  and  2.4 
their  modifications  using  the  two  melhuds  are  developed.  In  Section  2.5  a comparison 
is  ntade  of  all  the  three  methods  for  a Poisson  response.  A similar  comparison  is  made 
in  Section  2.6  concerning  a binary  response.  Finally,  in  Section  2.7  an  example  is  given 
to  demonstrate  the  derived  results. 

2.2  Standard  GLM  Parameter  Estunaiion  Prneediire 
In  a standard  GLM  we  have,  from  (2.1.2),  E(z,)  = pfp.)  and 


which  provides  an  approximatinn  of  V’nr[p(vi))  for  i = 1,  ...n.  In  this  case,  it’  is  either 
a constant  or  a known  function  of  p,. 

The  fitted  model  is  written  in  vector  form  as 


where  X = and  E{e)  = 0,  V'nr(e)  = W = diag(u;|, .. . ,tv,). 

The  estimation  of  Q is  done  as  follows  ; 

We  start  vnth  an  initial  estimate  of  0,  say  0o-  Let  t|s,  = f'[xi)0o  and  consequently 
Poi  = Using  the  value  of  p„,  and  assuming  s*  tu  be  the  first  iterated  value 


* = + e 


(2.2.1) 


2ft  = sW)  + Ivi  - 


of  s,. 


From  the  expression  of  u>,  it  is  clear  that  It  is  a constant  multiple  of  a function  of  (t,. 
Let  IVo  = diag(uioi,.  ...tUft.),  where  tim,  is  the  value  of  w,  at  the  first  iteration  which 
is  calculateii  using  po,  and  <r^  (whirh  is  either  a function  of  ph  or  a constant).  The 
vector  /3i  is  obtained  by  regressing  *5  = (201, -tifti)'  on  the  explanatory  variables 
using  a weighted  least-squares  procedure,  i.e. 

ax^(X’Wi'X)-'X'Wl'z„ 

Tills  procedure  continues  in  a similar  manner  using  /?]  as  an  estimate  of  B for  a 
second  iteration  step  until  estimates  of  0 in  two  consecutive  stops  are  sucli  that 
l^m.i  “ where  d is  a small  pceitive  number.  If  after  the  mtli  iteration,  the 

process  converges  then  we  write 

= (2.2.2) 

and  Var(jS)  is  approximately  estiinaied  by 

Va7{0)  ~ (X'lV'jf)-",  (2.2.3) 

where  W - W„,  = diag(u)mi, . . .. »„„)  is  the  estimated  value  of  W at  the  mth 
iteration. 

2.3  Method  I 

In  this  metimd,  we  consider  Pi,.  to  be  indepniidently  distributed  according 
to  an  exponential  family  distribution  witli  E(y)  = /i  = (pi  and  V’or(j/)  = U 

as  before.  The  link  function  is  also  the  same  as  in  standard  GLM  i.e,  ij,  = j()i,)  for 


1.  HowHver.  9(5/,)  is  approximalfd  by  a second-order  model  of  tlie  form 


a.  = S(«)  + (».  - + 


2 dll?’ 


(2.3.1) 


In  this  case, 


Tlius, 


= 0('d)- 


(2.3.2) 


where  9'(^*)  is  a known  function  of  /r,  and  hence  a function  of  Tji.  In  some  situations, 
a]  depends  on  jr,  and  in  other  situations  may  be  just  a constant.  Furtliermore, 

Var(s.)  = <i,V(»'.)l’  + ^(fl"(Mi)l*V‘i'’l(!/i-/h)*]+s'(/ic)/(/i,)£I(li.-w)’l 

which  provides  a second-order  approximation  of  Vnr((i(yj)j.  Note  that  9'(^ii)  ik- 
We  now  have  a different  response  with  different  mean  and  variance.  We  shall  use  the 
same  model  form  to  fit  the  new  data.  The  fitted  model  is  therefore  of  the  form 


I = Xfl-  + e- 


(2.3.3) 


E(e')  = 0 

far(«*)  = IV  = dlag(ie:, . . . .ur;) 


and  X0'  = s*(/i)  , the  vector  of  fl’(/t,),i  = 


The  iterative  procedure  for  this  method  will  be  as  follows  ; 


Wf  Ktart  with  an  initial  estimate  ot  0’.  Tlip  value  of  is  then  found  by  applying 
liie  Newtoii-Raphsou  method  {Agreati  1990,  Section  AJ-2,  p.  114)  to  the  equation 

At  the  (m  + l)8t  iteration,  we.  have, 


— % 


/i(OoaM) 


(2.3,4) 


where  and  ijo.m+ii  are  the  iterated  values  of>h  at  themth  and  (m+l)st  iterations. 
After  the  first  iterative  value  r)^  of  7,  is  found,  the  values  of  pn  = p~'(r?w),  aox,  and 
ur^  are  obtaiued  i s 1,. . .,  n.  Nern.,  2d  — (aoi, . . . , Zoe)‘  ia  regressed  on  the  input 
variables  and  ^ is  found,  which  is  again  used  in  the  Newton-Raphson  metliod  with 
/i(tji)  = 0(i?i)-/'(2i)i0J  = 0 i = 1,-..  ,n.  Thi-s  leads  to  171,  and 711,  and  ronseqiiently 
2i,.  The  iterative  procedure  is  continued  until  - 0f„\  < S.  where  d Ls  a small 

positive  number,  tn  this  case 


0‘  = 0'^^,  = fx'w;,-'xr'x'w;,-'z„,  (2.3.5) 


where  IV;;,  = IV  and  2„,  are  the  values  of  W and  z after  the  mth  iteration.  The 
variance-covariance  matrix  of  0 is  approximately  estimated  by 

ror(4-)  = (X'lV'-’X)-'  (2.3.6) 


2.4  Method  II 

In  this  method,  all  tire  assumptions  of  the  standard  GLM  are  considered,  namely 
being  independently  distributed  a.s  an  exponential  family  with  E{y)  = n 
and  Vnrty)  = U = diag(fff, . . . , oj).  The  systematic  component  tj,  = f'{Xi)0  Is 


!3 

related  to  the  link  Tunction  by  i),  = Again,  as  in  Method  I,  we  consider  the 

second*order  approximation  of  p(y,),  i.e. 

So,  £(si)  = and 

Var{z,)  ~ <»?|s'(»A)l’  + ^(9'(rt)ri'n>-[(l/i-«)’|  + fl'(/ti)jl"(A)£l(tf.-rt)’) 

However,  unlike  the  previous  method,  here  the  model  fitted  is 

z = 4'{X0)  + €’  (2.4,2) 


Vor(e-)  = W 

So,  ill  this  case  E(z)  = ij>{Xp)  = (p{jj)  (since  tj  = X0],  where  ‘t’iXff)  is  a vector 
whose  ith  element  is  ii{f{x{)0),  i = 1,.  ..,n.  The  variance  w‘,  is  a function  of  <1, 
which  in  turn  is  a function  of  tj..  Since  ij,  = /'(*i))3.  m,"  is  a function  of  0.  Therefore 
we  can  write  V'ar(«")  = W = V(^),  where  V(^)  Ls  a matrix  of  functions  of /3.  It 
is  assumed  that  tlie  functional  form  of  V is  known.  For  the  estimation  of  0 in  Model 
(2.4.2)  we  use  a non-linear  least  squares  estimation  procedure.  Using  the  generalised 
least  squares  method  as  described  in  Seber  and  Wild  (1989,  Section  2.1.4,  p.  27  and 
Section  2.8.8,  p.  88),  we  obtain 

flm.,1  =0,,  + - d>(X3„)l  (2,4.3) 

where 


„„  . 54m 


14 


f (")  = F0„j 

V‘~*  = V(3„),  and  r = U, s.)', 

Th(!  iteration  procedure  is  as  follows; 

We  consider  an  initial  value  9„  of  0.  Then  ^ = f'{xi)0„,  /io,  = s“'(’7ih)  and 
V'"’  = V(l3„)  are  found.  In  the  next  step. 

B,-9c,  + - <^(Xi35)| 

The  procedure  is  continued  until  -j3,„|  < S,  where  d is  a small  positive  number. 
An  estimate  of  the  asymptotic  dispersion  matrix  of  3 = 3m+n  approximately 

V'o7(3)  = (F’VV‘-'f)-'|^  (2.4.4) 

where  F = F*""*  and  W'  = V'<”“. 

The  criterion  for  comparing  the  standard  and  the  two  modified  GLM  htting  pro- 
cedures  is  based  on  the  varianres  of  the  predicted  mean  responses.  We  consider  the 
two  cases  where  Vi^  • . . >Vn  are 

(1)  Poisson  variatejr  with  means  y,,...  and 

(2)  Bernoulli  variates  with  success  probabilities  pi,. 


Suppose  y = (vi , . . . , p„)',  where  j/i’s  are  inilependent  such  that  y,  has  a Poisson 
distribution  with  mean  ji,  (i  = I, . . . .ti).  Then  Fly)  = y = (pi , - . . , (in)',  = 


The  link  runninii  when 


distribution  is  Poisson  Is  i),  = 


log(ji,]  Tor  i s 1,.  ...71.  Using  the  staiidnrd  GLM  method 

Thus.  E(z.)  = log(w)  and  V'nr(c.)  = ^ = m.- 

After  httiiig  model  (2.2.1)  we  get  the  estimate  /3  and  V'ar(3)  as  given  In  (2.2.2) 
and  (2.2.3).  respectively.  We  know  that. )!,  = e*'  = e/'(®0^. 

Tlierefore,  the  variance  of  /h  is  approximately  given  by 

Var(A.)  =7  /t?/'(Xi)(X'IV-'X)-7(»0  (2-5-1) 

In  Methods  I k II 


Var[(p. -rt)’l  = E(V.  - (A)' - [£^(Vi  - 
= 3p?  + M.  - /<? 

= 2/^’-t-/i, 

Therefore.  E(z,)  = log(;7,)  - = log(rt)  - = g'{in)  (from  2.3.2)  and 


In  the  rase  of  Method  I, 


9'M  = f'{Xi)0‘ 


after  fitting  the  model  given  by  (2.3.3).  Therefore  from  (2.3.6), 


Var(rM)  = /'(*,)(XW-'X)-'/(«i) 

Let  us  now  use  the  relationship 


Var-(M.)=(g)  Vor(9-(,«)) 


y'(<^)  = - 2^ 

then, 

V'oK/i,)  f'(z,HX'W‘-’xr'f(z<)  (2.5.2) 

In  Method  II,  from  tiie  expression  of  Kar(i3)  in  (2.4.4)  and  using  the  same  method 
as  in  standard  GLM  we  have  approximately, 

rar(M  = p?/'(xi)(/''W-'f  )-7(»0  (2.5,3) 

Let  us  denote  d(»?)  = («i(i7i) iiVn))'  where  ij(rii)  and  rj  to  bp  the  vector 

of  tji's  for  I = 1, ...,  n.  Then, 

FIBl  - ^ d<KT])dT] 

d0  80 


dv 


= £» 


Also, 


dy  . S(X0) 

30  30 

Ttierfforp  we  can  write  F(0)  = DX  and 

FW-'F  = X'lXW-'DX 

where  W = diag(in;, . . . , mj)  and 

D'-'WD-'  = D-'‘W 


aince  both  D and  W are  diagonal  matricee  and  the  rth  diagonal  eletneni  of  D ie 
I + jje  s=  1 + = e^fw*'  (*  “ So,  in  case  of  Metiiod  IT, 

Var{M=:i^/'{B,){X'D‘W-'DXr'f{ti)  (2.5.4) 

Let  ne  denote  tite  varianeee  of  tiie  predicted  meana  /%  for  the  standard,  Method  I 
and  Method  II  by  Varg{fti),  Var/()i,*)  and  Varii(iii)  reopectlvely.  FVom  (2.5.1), 
VarsOir)  =r  ^f[x,][X'W'X)-'f{xi) 

Fhom  (2.5.2), 

= (i;^)  f'(^i)ix'w-'x)-'f{x,) 

and  hnaliy  from  (2.5.4), 

Var„(.M  = ,.tf{x,){X'D'W-'DXr‘{(x,) 


r[x,]{X'[^^,D-'WD-'r'X)-'f{xt) 


LEMMA  2.1:  Let  A and  B be  positive  definite  matrices  then 


A > B {<=>  x'[A  — B]x  > Ofor  all  x > 0) 


PAf"  = / and  PBP'  = A 


A>B 

=>  p-'p'-i  > p-’AP'-' 
=>  / > A 
=>  I < A"‘ 

=>  P'P<P'A-'P 
=>A-'<fl-'  Q.E.D. 


=>A-'<B-' 


Proof:  There  Is  a nonsingular  matrix  P such  that 


Then, 


Let  us  now  compare  the  standard  method  with  Method  I.  It  needs  Co  be  shown 


Vors(Ai)  > VariiJii) 


x'i,;wr'x<x'i{^^'jwvx 


A sullicienc  condition  for  this  inequality  to  be  true  is 


or,  by  lemina  2.1, 
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since  IV  and  IV*  are  diagonal.  Wp  ronsider  the  j-th  diagonal  element  of  and 
j-th  diagonal  element  of  other  words,  j-th  diagonal  element  of 

W and  j-th  diagonal  element  of  W^*.  Thus, 

Var,(A)  < Vnrs()'»,) 

if  the  j-th  diagonal  element  of  W > j-th  diagonal  element  of  fot  t**!  J 


(aT)’ih^-sl 


■**<3  Ihi 
4)1.  -I- 1 ^ 1-2)3, 


Hence,  Vor)()i.)  < Vars()i,)  if 


(2.S.5) 


Let  us  suppose  maxi^<„  is  attained  at  j = jo.  Now,  > 0 for  all  )),  > 0. 

Also,  maxi<j<,i^^  = < 0 for  ii„  > .5.  Therefore  Method  I is  better  than 

the  standard  method  if,  for  all  j = l,...,n,  )ij  > .B. 

To  com|)are  the  standard  method  with  Method  II,  it  is  sufficient  to  compare  the 
matrices  )i?W'  and  IV* D“''  (using  the  lemma),  which  is  equivalent  to  compar- 

ing IV  and  D~'W‘D~'‘.  More  specifically,  Vnr)/()i,)  < Vars(ji,)  if  D“‘lV*j5"'' 
< IV.  Both  of  these  matrices  are  diagonal  with  j-th  element  of  fV  being  tir,  = l/)ij 
and  j-th  element  of  D"'IV*D"''  being  tej-  Tlierefore  we  ran  say  that. 


< i-th 


«(*,)■>• 


‘(iSi)’”.- 


(s?h)  ""'-(iItt)  * 


Method  I ie 


than  Method  II  if 


' 2/^ 

,2)1. + 1 


which  is  true  ilT 


(2-5.7) 


If 


for  some  jJ,  1 < jj  £ 'll  (2.5.7)  is  satisfied  if  (i,j  = The  above  is  true  for 
estimating  just  for  a fixed  i.  So,  in  terms  of  the  variances  of  the  predicted  nteans. 
Method  I and  Method  II  are  not  comparable  except  for  the  case  when  h,  < 0.5.  In 
that  situation  Method  II  wilt  be  preferred  over  Method  I. 


Let  us  next  roitsider  n Bernoulli  variates  y„  with  probabilities  pi, . . . ,p„. 

Then, 


E(»)  =P<  Vnr(j/)  =diag(pi(l  -pi),....p„(l  -p„)) 


Here  the  link  function  is 


for  i = l,...,n.  For  the  standard  GLM  procedure 


So,  E(zi)  = log  “>'1 


Aftn-  Rltlng  the  model  (2.2.1)  we  get  the  expressioi^s  for  0 and  Var(S)  as  given  Id 
(2.2.2)  and  (2.2.3)  respectively.  We  know  that,  p,  = = h(^)  and  ft  = f'(xi}0. 

Therefore  we  have. 


Varip,)  = V'or(/i(ft))  j V'ar(ft) 

= J?(l-p.)V'(*i)(X'H'-'X)-7(*f) 

In  ease  of  Methods  I and  II 


(ki-y.)  («i-p.)^(2p.-l) 

P,(l-P.)  2p?(]-p,)2 


(2.6.1) 


£(lfi-I>.)®  = M.(9.-P.) 


Vaf  [(»■  - Pi)’]  = 3pj5,*  + p,?.(l -6p,9i) -p?j? 
= P.9.  - 4p?9? 


Therefore,  B(z.)  = p'(p,)  = log  = log  (-J^)  + 


„„  _L  (2p,  - l)(‘tp.Pi  + 4piq,’-  12ffp,  + 2p,  -1)1 
Pi9i  1 4ff??  J 


In  the  caw  of  Method  I. 


g'ip.)  = 

after  fitting  the  model  given  by  (2.3.3).  Therefore, 

Vnr(s>,))  = flx,)[X'W-'Xr^fM 
We  can  use  the  relationship 


Var(fi,)  nr 


(t)  l'.r(i-ta)| 


= log(Pi/{l  - Pi))  + 


and  thus, 


(2-6.2) 


In  Method  II,  from  the  expression  of  VfiT{0)  in  (2.4.4)  and  using  the  same  pro- 
cedure as  standard  GLM,  we  have, 

Vor(p,)=rp?(l-p,)V'(®0(F'»'-'F)-‘/(»i)  (2.6.3) 


Here. 


Now  d(»|i) 


d0  9ij  d3 

- - • * sa  -»+■&*•  ’■I"*-., 

dUn,)  I ' + if  i = j 

3ij,  \ 0 otlierwisp 


Using  the  same  tedini<|Ue  as  in  the  Poisson  case  we  have. 
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We  tan  write 


£2 

d0 


X 


FW-'F  = X'D'W-'DX 


So,  in  case  of  Metliotl  II, 

V«rlji.)^^{\-p,fr{xi){X-D'W-'DX)-'f{Xi)  (2.6.4) 


Let  us  denote  tlie  variances  of  the  predicted  means  p,  for  the  standard,  Method  1 
and  Method  II  by  Vnr,e(^),  V'nri(p,)  and  Vnr„(^)  respectively.  From  (2.6.1), 

V'ors(R)  cr  ^0-p.ff(x>){X'W-'Xr'f(x>) 

= -P.)*)W|-‘X)-‘/(*i) 

From  (2-6.2), 

V'or;(p.)  ~ (2,^(1  -p.fffMiX'W-'Xr'fM 
= /'(iCi)(.X’[(2p?(l  -p,)’)’W]-'X)-7(*i) 

and  finally  from  (2-6-4), 


Var„(ft)  =-  ^(l-p,)V'(«r)(X'D'W-'Z)X)-’/(*i) 

Let  ns  start  the  comparison  of  the  standard  method  with  Method  I,  As  in  the 
Poisson  case,  using  Lemma  2.1,  we  consider  tlie  j-tli  diagonal  element  of  j^(l -p,)*W' 
and  the  j-th  diagonal  element  of  (2j^(l  - p,)*)’W'*  or,  in  other  words,  the  y-th 
diagonal  element  of  (1  - p,)“W'  and  the  j-th  diagonal  element  of  (2p,(l  - p,)')*W. 


Vnriip,)  < V<irs(ft)  if  for  all 


i-th  diagonal  elemeiil  of  (2pi(l  < j-lli  diagonal  plnm«nt  of  (1  - p,)'W 


(2pi(l  [.  (2p,  - l)(‘lPj?j  + 4pjgf  - 12p;q,  + 2p,  - 1)1  (1 

P,9)  1‘  “hjOj  1 P>9) 

r,  , (^Pi  - l)l*Pi^  + *Pi9i -12pf^  + 2Pj-  ')1  , (1-P.)^  o 

4p’«?  J (2ft(l ^ 

r (2p,  - l)(4pjgj  + 4pj^-12p^a  +2pj-  1)1  [1  -p,)’' 

4fji^  J (2p,(l -p,)2)’ 

Let  ja  be  the  value  of  j where  the  maximum  is  obtained,  that  is, 

[■  (2pj  - l)(4p,?j +4pj^;  - 12p?2,  + ^, -1)1 

isjS"  I 4i^gf  J 

_ r,  ^ (2px  - l)(4pj»?J.  + ‘>Pk41  - + ZPi.  - 1)1 

"I  J 

for  some  1 < j„  < n.  The  inequality  [l  + < , 

holds  for  .1464466  < Pj,  < .8535534  and 

Therefore  Method  I is  belter  tliaii  the  standard  method  if  for  dl  j,  .1464466  <pj< 


To  compare  the  standorri  method  with  Method  II,  we  consider  the  matrices  pf(l  - 
!i,)‘^lV  and  pj(l  - p.)®i?"' WD"'‘,  or  equivalently,  W and  D~'W’D''‘.  Both  of 
these  matrices  are  diagonal  with  j-th  diagonal  element  of  W being  tiij-  = l/py(l  -py) 
and  y-th  tiiagonal  eiement  of  £)“' W*D“*'  being  4pJ(l  — Pj)®ufJ.  Therefore  we  can 
say  tiiat,  Vari,{p,)  < Vnrsfp,)  if  for  all  j = 1,..  ,,ri 

j-lh  diagonal  element  of  £>"'  W‘D~^'  < j-th  diagonal  element  of  W for  all  j i.e. 


(2p, -l)(4p,gj  + 4p,<)?-12pj?j  + 2pj-l)1 

■•pJ??  J 


I 


+ (2fij  - l)(4p,<l,  +4p,qJ  - + 2p,  - 1)  < 1 


/«(P,)  <1  V J 


where  At?,)  = 4p5«j  + (2p,  - l)(4p,?,  + 4p,9j  - 12;^?,  + 2pj  - 1) 

II  is  elear  that  /s(pj}  < 1 for  all  Pj  > 0.  So,  for  all  the  Bernuulli  processes  Method  II 
is  better  than  the  standard  method. 

Let  us  now  compare  Metliod  I and  Method  II  for  the  Bernoulli  prchlein.  Here  we 
have  to  consider  the  mat, rices  (2j^(l  - p.)*)W*  and  ;^(1  - p.)’D-'  WD-"  which 
is  same  as  coinparin}!  4;^(1  -p,)’lV*  and  D~'W‘D~''.  The  j-th  diagonal  element 
of  4p?(l  -p,)’lV'  is  4;^(l  -p,)’uij‘  while  the  j-lh  diagonal  element  of  D~'W‘D~" 
is  4j^(l  - Pj)*«j‘. 

Method  I is  better  than  Method  II  if 


for  some  jo,  1 < jo  5 n,  theti  (2.6.5)  is  satisfied  if  either  P^  “ p,  or  P;n  = 1 — p,-. 
Method  I is  worse  than  Method  II  if 


‘*1^(1  -Pi)’«o/  <4p’(l  -p,)’u>j‘  V j 


which  is  true  If 


4j^(l  -p.)’  < ,mi|i__4j^(l  -Pj)’ 


(2.6.6) 


If 


^min__4pj(l  -pj)’  = 4pj,(l  -p,,)’ 


4;^ (I  - Pi)’««,'  > 4pJ(l  - Pjj’up,-'  V j 


which  is  true  if 


4p?(l  -Pi)’> 


(2.6.6) 
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4pJ(l  -P,)“  = 4p^(l-Pi;)’ 

for  some  jJ,  1 < ;u  S ’'i  (2-6.6)  is  satisfied  if  Pj-  = p,  or  p^j  = 1 - p,.  The 

above  is  true  for  estimating  just  p,  for  a fixed  i.  So,  iu  terms  of  the  variances  of  the 
predicted  means.  Method  I and  Melliod  II  are  not  romparabie  except  for  the  case 
vrheii  Pi  < .1464466  or  p,  > .8535534.  Method  II  wiil  be  preferable  to  Method  I in 
that  situation. 


2.7  Example 

In  tills  section,  we  compare  tlie  tiiree  methods  using  the  popcorn  example  given  by 
VUiing  and  Kiiuri  (1991).  In  tbis  example,  the  response  is  the  number  of  unpopped 
kernels  (defects),  beginning  with  a ^th  rup  of  unpopped  popcorns.  The  variables 
under  the  experimenter's  control  are: 

burner  setting  i],  amount  of  vegetable  oil  ij  and  popping  time  ij. 

Tiie  initial  number  of  kernels  was  tiie  same  for  cacli  combination  of  xx.x^,  and  xj.  A 
Box-Belmken  (1960)  design  witli  tiiree  center  point  tepUcatis  was  used  and  a second 
order  model  in  i,,  xj  and  ij  for  tiie  linear  predictor  was  fitted.  The  response  being 
the  number  of  defects,  a log  link  was  assumed.  Tbble  (2.1)  gives  the  actual  levels  of 
the  input  variables  while  l^ble  (2  J)  gives  the  design  and  tlie  response  values  for  the 
experiment.  In  Tables  (2.3),  (2.5)  and  (2.7)  are  listed  the  parameter  estimates  and 
their  estuiiated  standard  errors  with  each  of  the  three  different  methods.  Values  of 
the  predicted  responses  (A)  and  tlieir  estimated  variances  (Vor(A) ) are  presented  in 
Tables  2.4,  2.6  and  2.8  for  the.  standard  GLM,  Method  I and  Method  II,  respectively. 


TABLE  2.1 


Actual  levels  for  the  iuput  variables 
m tbe  popcorn  experiment 


Burner  setting 
I (couC.  scale  of  0-10) 


Amount  of  vegetable 
oil  xa  (tablespoons) 


TABLE  2.2 


Design  aud  response  values  [number  of  inedible  kernels 
among  jtli  cup  of  unpopped  kernels) 


TABLE  3.3 


Parameter  (atiniates  aud  their  standard  errors 
fnr  ^rmiHard  GLM 


TABLE  2.4 


Predicted  and  the  observed  rcspouaes  (number  of  inedible  kernels) 
and  tlieir  estimated  variances  for  stoudard  GLM 


TABLE 


Pammptt-r  gstiinatcg  aud  tlieir  standard 
for  Method  I 


TABLE  2,( 


Predicted  aud  the  observetl  respoxuies  (uumber  of  iuedible  kernels) 
aud  their  estimated  variances  for  Method  t 


TABLE  2.7 


:ir  ataudard  errorg 


for  Motliod  11 


TABLE  2-f 


Predicted  and  the  observed  respomics  (number  of  inedible  kernels) 
aud  their  estimated  variaiices  for  Method  11 
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The  valuer  of  the  parameter  eatimatea  from  all  the  three  nrethods  are  close  to  eacli 
other.  The  values  of  the  estimates  of  the  standard  errors  or  »d(ft)  are  quite  close 
vrith  Method  I and  the  standard  method.  But  the  values  of  sd(3i)  for  Method  II  are 
slightly  smaller  than  those  obtained  from  the  standard  method.  On  average,  there  is 
a 3>9%  reduction  in  the  standard  deviations  with  Method  II  when  compared  to  the 
standard  method. 

The  values  of  the  predicted  mean  responses  [[k)  at  all  design  points  are  quite 
close  for  the  standard  and  Method  I while  for  Method  II  they  are  sliglitly  higher  in 
some  cases.  The  generalized  Pearson  goodness  of  6t  statistic  is 
calculated  for  each  of  the  three  cases.  The  results  are: 


Standard  method 
Method  I 
Method  II 


16.496 

16.907 

17.836 


while  Xi<  os  ~ 73.68.  So,  it  is  noticed  that  all  the  tliree  methods  6t  the  data  quite 
vrell.  Since  /2,’s  are  all  greater  tlian  0.5,  both  Mntliod  I and  Method  II  wiil  both 
give  more  precise  estimates  tiian  the  standard  method.  As  for  Vor(/(,),  Method  I 
produced  smaller  values  than  did  the  standard  method.  In  case  of  Metiiod  II,  for 
i = 7,3,5  and  8,  the  estimated  values  are  not  small  due  to  the  experimental  error 
associated  with  the  estimation.  Tlie  large  predicted  p,’s  also  have  larger  V'nr(p,), 
since  from  (2.5.4),  Knr(/i,)  is  directly  proportional  to  pj  . 


CHAPTER  3 


TESTING  FOR  LACK  OF  FIT  IN  THE  PRESENCE  OF 
HETEROGENOUS  VARIANCES 


3.1  liit.rocIi]cl;ioii 


III  Chapter  One,  goodness  of  link  ttats  were  mentloiied  for  generalized  linear  mod- 
els. In  Pregibon's  (1980)  procedure  the  test  of  adequacy  of  the  link  function  was 
perfoniied  assuming  a specified  family  of  alternatives.  Hosmer  and  Lemeshow  (1980) 
suggested  a goodness  of  hi  test  for  logistic  regression  model.  The  standard  lack  of  fit 
test  deals  with  responses  which  are  nonnally  distributed  with  equal  variances  (Khuri 
and  Cornell,  1990.  p.9G). 

In  this  cliapter,  a lark  of  fit  test  is  propivsed  for  the  systematic  component  of  a 
special  case  of  GLM  where  the  response.s  are  normally  and  independently  distributed 
with  lieterugeuous  variances. 

Suppose,  a nitidel  of  the  form 


is  assumed,  where  X is  a known  n x p matrix  of  rank  p,  /3  is  a vector  of  p iinknowu 
parameters,  E ~ N(e,  S),  e being  a n x I vector  aiiri  £ has  the  form 


Y = X0  + 


(3.1.1) 


(3.1.2) 
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(3.1.2) 


III  Section  3.2,  a test  statistic  for  lack  of  fit  Is  given  when  E is  known,  while  in  Section 
3.3  an  approximate  test  is  prcsenletl  for  tlie  case  when  E is  nut  known.  Section  3.4 
gives  examples  to  show  the  accuracy  of  the  appro.ximate  test  statistic. 

3.2  Calculation  of  the  Test  Statistic  for  fail,  nf  Fit  (E  Knownl 

To  discuas  the  lack  of  fit  test,  we  assume  that  the  true  mean  response  n satisfies 
a model  of  the  form 


where  the  matrix  X,  contains  as  elements  cross-prudurt  and  power  terms  involving 
the  design  settings  of  the  control  (input)  variables  and  0^  is  a vector  of  parameters 
associated  with  the  cross-product  and  povrer  terms.  In  formula  (3,2.1),  Xj/S,  repre- 
sents the  portion  of  the  true  model  not  used  in  the  fitted  model.  We  discuss  both 
cases  when  the  variance  covariance  matrix  is  known  and  unknown. 

Let  us  consider  fitting  a model  not  containing  0\.  If  E is  known,  vre  can  transform 
the  data  vector  Y and  tlie  matrix  X as 


= E(V)  = X/3  4-  X|i3, 


(3.2.1) 


y.  ^ 


X-  = E-'''X 


The  transformed  fitted  model  will  tlicii  be  of  the  form 


V = X-0  + e.- 


(3.2.2) 


where  W(eM)  fore’  = X;/3, , X;  = S-''’X,. 


Since  y ~ y ~ N(ii' ,1),  n' - X‘^  + 


ssE  = yy  -y‘x‘(x‘'x’)-'x-'Y- 


= y [s-' - E-'x(;ir's-'x|-'x'E-']y 

= YAY.  where 

A = -S-'X(X'S-'X)-'X'E"'.  The  mean  of  SSE  can  be  calculated  using 

the  formula  in  Searle  [1971,  p.  55]  ae 
E[5S£)  = + (r(AE) 

= (^X'  + 13[X',)>1(X3  + X,/3,)  + (r|J,  - E-'X(X'E-’X)-‘X'| 

= (^X'AX/3  + 2^;X',>lXl3  + ^,X;/4X,/3,)  + (r(I„  - 1^)  (3.2.3) 

ffX'AX0  = /9'(X’E-’X  - X'S-'X(X’S-'X)-'X’2-'X)(9 


lf,X\AX0 

= ff,  (x',e-'x-x;e"'x(x'E''x)-'x'e-'x)]9 
= (x;e-‘x -x;e-'x)0  = o 

Tlierefore  E{SSE)  = /yiXlAXifli  + (n -p). 

Theorem  (Searle,  1971,  p.57); 

Let  Y ~ 7V(p,  E)  with  fi^O.  Then  Y'AY  has  a non-central  distribution  with 
r degref*  of  freedom  and  a non-cenirniity  parameter  A = if  *nd  only  if  AE 

is  iriempntent  of  rank  r = rank(A). 


Here,  AE  = /„  - E-'-X-(X'E-’X)-'X'. 


(AE)(AE) 

= /„  - £-'X(X'E''X)-'X'  - E-‘X(X'E-'X)-’X'  + 
S-'X(X'E-'X)-'X'E-'X(X'E-'X)-'X' 

= /„-E-'X(X'E-'X)-'X' 

= AE. 


A = iji'Afi 
= i/3;x;AX,/3, 

and  rank(A)  = rank(AE)  = tr(AE)  = n-p.  Hence,  by  the  above  theorem,  SSE  has 
the  non-central  A*  vrith  (n  — p)  df  and  non-rentrallty  paraaneier  A. 

Under  the  null  hypothesis  Hq:  AXi/3,  = 0, 

T = SSE  = Y‘  [E-'  - E-'X(X'E-'X)-‘X'E-']  Y ~ 

We  reject  Ht,  at  level  a'  if  observed  T > • 

3.3  Aouroainiate  Test  Statistic  for  Unknown  E 

In  most  practical  situations,  E is  unknown.  For  these  cases,  vre  develop  an  approx- 
imate teat  ba.aed  on  estimating  E,  where  E is  assumed  to  be  diagonal  with  diagonal 
elements  nut  necessarily  etpial. 

Suppose  thal  the  design  used  to  ht  the  mt>del  has  m distinct  design  points  with 
n,  replications  at  each  point  i=  1, . ...m  and  n,  = n. 


Then  E has  the  form 


(3.3.1) 


If  an  efitimaie  of  a’f  is  |iv»n  by  d?  = where  K,  is  the  j-th 

observation  at  the  i-th  point,  i = and  j = l,...rn,  and  ^ is  the  average  of  n, 

replicated  observations,  then 
£ can  he  estimated  by. 


£ = 


(3.3.2) 


Defiiiitiuri:  A inatrbt  4„  is  said  to  converge  to  A in  prohahiliiy  if  and  only  if  the 
corresponding  component-wise  convergences  hold  (Serfling,  1980,  p.6). 

LEMMA  3.1:  Let 

f = r'[E-'  -E->A-(x'E-'x)-'x’E-']r. 


r = 7'-hO,(i»innT‘'’) 

The  notation  ‘Os(^n)"  represents  a ratidoin  vartahle  sucli  that  for  every  d > 0,  there 
is  a coitstant  K and  an  integer  no  such  that 


Proof:  For  siiiiplirity,  let  us  denote  inin.ni  liy  ri,„„.  It.  is  easy  to  see  that 
d?  = o? + 0»(t>r'^)  for  all  i = l,...,m 
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Binte  (dj  - (7f)  Hu*  stamidrd  dfvialion  (Agrrati,  1990,  p.419).  Thus  using  thp 
Above  liefinition  we  can  say, 


= E + A, 

where  Ai  is  a diagonal  matrix  with  ilh  diagonal  element  'n  probability  tor 

i = l,...,m  which  can  then  be  written  as  So, 

t = Y‘  [£-’  - ±-'X(X'z:-'X)-'x'£-'|  Y. 

= r'l(S  + A,)-'  - (E  + A,)-'-3f(X'(E  + A,)-'X)-'X'(E  + A,)-’] V 

Let  us  make  the  following  expansLons: 

(E  + A.)-‘ 

= |E(I  + E-’A,)|-'  = (/+  E-'A,)-'E'’ 

= (/  — E“' A,  + higher  order  terms)S”' 

= £*'  — E“' AiE''  + ...  = E”'  + A;  say 
and 

|X‘(S  + A,)-'Jf]-'  = (X'(E-'  + A,)X|-’ 

= |X‘E-’X|-' - (X'E-'X|-'X'A3X[X'E-'X|-‘  + high«r  order  terms 
= (X'E''JC)-'  + A,  say 

where  Aj  = -E"' A,E’' and  A,  = -(X'E-'X)-'X'AjX(X'E-'Xl-'  areO,(n;;^) 
and  respectively.  The  above  expansions  are  valid  under  tlie  conditions  that 

(Graybill,  1983,  Chapter  5) 

(1)  |A,(E-'A,)|  < 1 for  all  i or  ||E-'A,|1  < 1 and 
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(2)  |A.((Jf'£-'X)-'X'i2X|l  < 1 for  all  i or  ||(X'£-‘Xr'X'A,X||  < 1 
when'  A,(.)  is  ilia  ilh  aigciivaliia  of  a matrix,  |X,|  Ls  ilir  modulus  of  A,  and  ||.||  is  its 
Eudidian  norm.  Usuig  thesp  reaulrs  T liecomea, 

V |(S-’  + A,)  - (£-'  + A,)X[(X'E-'X)-’  + A,1X'(E-'  + A,)]y 
= y'(E->  - E-‘X(X'E'‘X)-'X'E"’  {S-’X(X'E-’X)-'X'Aj 

+AjX(X'E-'X)-'X'E’'  + AjX(X'E-‘X)-'X'A,  + E-'XAjX'E’' 
•t-E-’XAjX'A,  + AjXAjX'E’’  + AjXAsX'A,))y 
= y'[E-’  - E-'X(X'E-'X)-'X'E-')y  + A. 

= T + A« 

whera  A,  = y'fA^  - {E-'X(X'E-’X)-'X'Ai 

.f  A2X(X'E-'X)-’X'E''  + AjX(X’E-'X)-'X’A,  + E-’XAjX'E’' 

* E-‘XAjX’A2  + AaXAjX'E*'  +AjXA,X'Aj)|y 

Is  of  order  Thus  the  rriiiral  values  of  f can  be  approximated  by  the 

critical  values  of  xi-,  under  tiie  null  hypothesis. 

34  Exaiiinie 

To  diedt  the  closeness  of  the  critical  values  of  T and  T,  we  consider  the  following 
example.  Let  us  assume  a first-order  model  as  a fitted  model 


i = The  true  mean  response  is  assumed  to  be 

Hi,  = iJo  + ^ ^ 8uxl  + ^ S 


and  Vsr(j).,)  = (7?,  j = i = 1, ....  m.  Wf  liavu  : 


before  fhat. 


^ = Xfl  + x^a, 

An  = AXa  + AXiBy 

and  under  Ha.  AXi/3i  = 0,  we  have, 

An  = Axa 

To  generate  V.j’s  from  we  specified  values  of  ^ = (5i, and  of,  i = 

A total  nf  50,000  response  vectors  were  generated  and  each  time  E and 
hence  T vrere  calculated.  The  tail  probability  of  t.  P\T  < t),  can  then  be  computed 
approximately  from  these  50,000  simulations  by, 

#t<T 

50, 000 

For  different  r,  these  approximate  percentage  points  are  then  compared  with  the 
percentage  points  of  xl-p  i-e-  PUn-p  ’"I- 

Alsu  to  find  out  whether  the  approximation  of  the  test  statistic  is  affected  hy  the 
valuas  of  of  we  have  considered  three  different  E matrices  having  different  degrees  of 
dispersion  in  the  values  of  of.  If  the  value  of  the  coefficient  of  variation 


is  high  then  E is  said  to  be  higlily  dispersed  and  if  the  value  is  small  then  the 
dispersion  of  the  diagonal  elements  of  E is  low.  If  rv=  0,  then  the  variances  arc  all 

Imtead  of  comparing  the  critiral  values  of  t and  T for  each  t and  for  each  com- 
bination of  n,'s  and  of's,  we  use  a measure  of  accuracy  of  approximation.  Lei  us 

*nJ  nW-fS 


If  iw  consider  s of  tlie  r values,  then  ns  a measure  of  approximation,  we  consider 


goodness  measure  = ^ E 

Therefore  the  smalier  the  values  of  the  goodness  measure,  the  better  the  approxima- 
tion of  tile  exact  values  by  the  percentage  points  of  T. 

The  first-order  designs  conaidpred  are  a 2*  factorial  design  with  center  point  repli- 
cates for  k = 2,3  and  4.  Different  rombiiiations  of  n,  and  different  sets  of  S were  aiso 
considered.  Ail  the  bracketed  entries  in  the  foliowing  tabies  are  the  standard  errors 
of  the  estimates  of  the  tail  probabiiities.  Aiso,  for  2°  factorial,  the  tail  probabilities 
for  both  T and  distributions  are  plotted  for  n,  = 2, 3,4, 5,6  and  cv  = 0.23  which 
are  displayed  In  figures  3.1-3.5. 


TABLE  3.1 


Desigu  settings  for  2'^  factorial  with  cetiter  point 
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TABLE  3.2:  Simulated  tail  probabilities  for  bnlauced  2’  factorial 


n, 

2 

3 

4 

5 

6 

X*  prob. 

0(0) 

0(0) 

0(0) 

0(0) 

0(0) 

.2 

,032(.001) 

0(0) 

0(0) 

0(0) 

0(0) 

,35 

C.V. 

,3G9(.002) 

-419(.002) 

.439(.002) 

.453(.002) 

.458(.002) 

.5 

:=.24 

.S6G(.002) 

.689(.002) 

.758(.002) 

.802(.OO2) 

832(.002) 

.65 

.707(.002) 

.844(.002) 

-901(.001) 

.933(.O01) 

.95l(.001) 

.8 

.841(.002) 

.946(.001) 

.975(.001) 

.988(0) 

.993(0) 

.95 

g.  meas 

.577 

.568 

.589 

.614 

.635 

0(0) 

0(0) 

0(0) 

0(0) 

0(0) 

.2 

.031(.001) 

0(0) 

0(0) 

0(0) 

0(0) 

.35 

c,v. 

.354(.IW2) 

.409(.002) 

.432{.002) 

.444(.002) 

.452(.002) 

.5 

=1.07 

.545(.D02) 

.673(.002) 

.745(.002) 

792(.002) 

.824(002) 

.65 

.684(.002) 

.823(.002) 

.892(.001) 

92G(.001) 

.947(.001 ) 

.8 

.818(.002) 

.934(.001) 

.971(.001) 

.986(.001) 

.991(0) 

.95 

g.  ineas 

.586 

.569 

.584 

.608 

.630 

0(0) 

0(0) 

0(0) 

0(0) 

0(0) 

.2 

.028(.001) 

0(0) 

0(0) 

0(0) 

0(0) 

.35 

C.V. 

.320(.002) 

.384(.002) 

.4!4(.002) 

.430(.002) 

.443(.002) 

.5 

=1.96 

.4S9(.002) 

.633(.002) 

.71G(.O02) 

•768(.002) 

.803(.002) 

.65 

.617(.002) 

.789(002) 

.862(.002) 

.907(.001) 

•931(.001) 

.8 

.752(.002) 

.903(.001) 

.952(.001) 

974(,001) 

.986(.001) 

-95 

g.  Dieas 

,684 

.580 

.577 

.596 

.615 

TABLE  3.3 


Dcaiju  Bettings  for  2*  factorial  with  center  point 
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TABLE  3.4;  Simulated  tail  probabilities  for  balanced  2^  factorial 


a. 

2 

3 

4 

5 

G 

prob. 

.003(0) 

0(0) 

0(0) 

0(0) 

0(0) 

.2 

.070(.U01) 

.065(001) 

.05G(.001) 

.026(.001) 

.010(0) 

.35 

.159(.002) 

.317(.002) 

.3G2(.002) 

.390(.002) 

.413(.002) 

.5 

=.23 

.253(.002) 

.526(.002) 

.G30(.002) 

.G98(.002) 

.74S(.002) 

-65 

.354(.002) 

.703(.002) 

.818(.002) 

.877(.0ni) 

.915(.001) 

.8 

.499(.002j 

.871(.002) 

.946(.001) 

.976(.001) 

-987(.001) 

.95 

g-  m«as 

1.36 

.511 

.487 

.532 

.578 

.003(0) 

0(0) 

0(0) 

0(0) 

0(0) 

.2 

.076(.001) 

.077(.001) 

.050(.001) 

.027(.00l) 

.010(0) 

.35 

e.v. 

.I75(.002) 

.282(.002) 

.335(.002) 

.370(.002) 

.390(.002) 

.5 

=.997 

.278(.002) 

.478(.002) 

.589(.002) 

.6G4(.002) 

.717(.002) 

.65 

.388(.002) 

.G46(.002) 

.777(.002) 

.648(.002) 

.894(.001) 

.8 

.551(.002) 

.824(.002) 

.922(.001) 

.961(.00l) 

-978{.001) 

.95 

g.  meas 

1.21 

.735 

.518 

.53G 

.573 

.003(0) 

0(0) 

0(0) 

0(0) 

0(0) 

.2 

.OTO(.OOl) 

.074(.00l) 

.050(.001) 

.027(.001) 

.010(0) 

.35 

C.V. 

.159(.002) 

.269(.002) 

.325(.002) 

-359(-002) 

.385(.002) 

.5 

=2.01 

.253(.002) 

-453(.002) 

.570(.002) 

.654(.002) 

.708(.002) 

.65 

.354(.002) 

.G18(.002) 

.755(.002) 

.835(.002) 

.884(.001) 

.8 

.499(.002) 

.792(.002) 

.902(.001) 

.951(-001) 

.974(0) 

.95 

g.  meas 

1.36 

.G52 

.533 

.522 

.572 

Figure  3.1:  not  of  x*  and  T-ha(  tail  probabilities  for  n,  = 2 for  2’  factorial 


Figure  3.2:  Pluluf^f*  andT-hat  Uil  probabilities  fur  n,  s3for  2’  factorial 


Figure  3.3:  Plot  of  g’  and  T<hal  tail  probabilities  for  n,  =4  Ibr  2'  fiiclorial 


Figure  3.4:  Plodifx'  andT-halUil  pnibabilities  for  n,  sSfor  2’  (actoriul 


Figure  3J:  Plot  ofx‘  and  T-hal  tail  probabilities  for  n,  s 6 for  2’  factorial 


TABLE  3.5 


Daigp  settings  for  2^  factoriai  with  center  poiut 
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TABLE  3-6;  Simulated  tail  probabilities  for  balanced  2*  factorial 


n, 

3 

4 

5 

6 

7 

X’  prob. 

.004(0) 

.001(0) 

0(0) 

0(0) 

0(0) 

.2 

.051(-001) 

.059(.001) 

057(.001) 

-049(.001) 

.042(.001) 

.35 

c-v. 

.15K.002) 

.223(.002) 

.273(.002) 

-309(-002) 

.337(.002) 

.5 

=.23 

.282(.002) 

.430(.002) 

.S34(.002) 

-615(.002) 

.672(.002) 

.65 

•446(.0C2) 

.642(.002) 

.764(002) 

-837(.002) 

•887(.001) 

.8 

-677(.002) 

-862(.002) 

-939(.001) 

-970(.001) 

.98G(.001) 

.95 

g.  meas 

1.13 

.707 

.570 

.536 

.536 

.003(0) 

.001(0) 

0(0) 

0(0) 

0(0) 

.2 

043(.001) 

.055(.001) 

-052(.001) 

.045(.001) 

.040(.001) 

,35 

C.V. 

127(.001) 

,199(.002) 

.251(.002) 

-287(.002) 

-317(.O02) 

.5 

=1.00 

-241(.002) 

-387(.002) 

.498(.002) 

.581(.002) 

,643(.002) 

.65 

.383(.002) 

.587(.002) 

-721(.002) 

.807(.002) 

.864(.001) 

£ 

.598(.002) 

-813(.002) 

.911(.001) 

•955(.001) 

.977(.001) 

.95 

g.  meaa 

1.346 

-813 

.022 

.663 

.548 

.003(0) 

.001(0) 

0(0) 

0(0) 

0(0) 

.2 

.044(.OOl) 

.055(.001) 

•052(.001) 

045(.001) 

-039(.001) 

.35 

e-v. 

.128(.001) 

.200(.002) 

.262(.002) 

.291  (.002) 

.315(.002) 

,5 

=1.82 

.240(.002) 

.388(.002) 

-499(.002) 

-582(.002) 

.646(.002) 

,65 

.381  (-002) 

.588(002) 

-724(.002) 

.808(.002) 

.602(.001) 

.8 

•596(.002) 

-810(.002) 

-910(.001) 

.954(.001) 

.977(.001) 

.95 

g.  meaa 

1.35 

.811 

.621 

.561 

-551 

Let  us  start,  with  the  2’  factorial  tlesigii.  Since  f is  approximated  by  Xn-|i- 
degrees  of  freedom  depends  on  the  niimber  of  observations.  When  the  n,’s  are  in- 
creased, so  is  the  number  of  degrees  of  freedom.  We  have  considered  sue  fixed  sets  of 
probabilities  (.2,.35,.5,.S5,.g  and  .93)  and  acrordingly  obtained  the  values  of 
r.  But  due  to  the  increasing  degrees  of  freedom,  the  values  of  r'a  are  also  increasing 
to  gel  the  desired  tail  probabilities,  so  we  have  differrnt  sets  of  r values  for  diflerent 
n,’s.  The  saute  conclusions  apply  to  2^  and  2*  designs  also.  The  approximation  of 
the  quantiles  of  the  chi  squared  distribution  is  poor  for  the  area  below  the  median 
while  the  approximation  Is  quite  cluse  for  those  above  the  median.  For  example,  we 
have  calculated  the  goodness  men,sures  separately  for  all  the  probabilities  above  and 
below  median  for  some  of  the  designs.  We  have  the  follawing  results: 

Design  c.v.  u,  good,  iiieas.  below  ntedian  good,  nieas.  above  median 
? 3 ^63  B05 

.23  4 ,486  .Om 

2*  .23  5 . 532  .004 

So,  for  a right-tailed  test,  the  )cJ.„-approxiinaiiou  performs  well  for  the  lack  of  fit 
test.  Also  iioticable  from  Table  3.2  that  the  value  of  goodness  measure  is  minimum 
at  n,  = 3 for  a 2‘‘  factorial  dralgn.  Even  though  from  the  theorem  the  actual  and 
the  approximated  critical  values  should  converge  to  the  true  chi-squared  values  as 
n,’s  are  increased,  due  to  the  increase  of  degrees  of  freedom,  values  of  r and  hence 
those  of  cj(t)  are  also  increased.  Therefore,  once  for  a particular  set  ofrij’s,  ci(t)  and 
cj(t)  are  close  enough,  any  further  increase  iu  replications  would  contribute  to  an 
increase  in  the  degrees  of  freedom  and  result  in  high  values  of  simulated  ej(r)'8.  The 
recommended  number  of  replications  for  a 2'^  factorial  is  therefore  n,  = 3.  Siuiilarly, 
looking  at  the  S’  and  2^  factorials,  the  recommended  number  of  replications  for  all 
the  designs  fow  low  cv  will  be: 

2’  factnri^  - 3 
2*  factorial  - 4 
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2*  factorial  - 5 

In  general,  baeetl  on  these  results  we  might  say  that  n,  = it  + 1 replications  are  needed 
for  a 2*  factorial  design  in  the  case  of  low  ev, 

Tliree  different  values  of  cv's  were  considered  for  all  the  first-order  designs.  It  is 
seen  from  all  the  three  cases  that  if  the  cv  inrreases  then  the  goodnese  measure  is 
also  Increased.  In  some  rases,  if  the  cv  Is  high  then  more  replications  are  needed.  Fbr 
example,  in  the  2^  case,  we  require  4 repliiations  when  cv  = 0.23  while  5 replications 
are  needed  for  cv  = 2.007. 

For  firstorder  designs,  if  the  number  of  variables  is  increased,  then  to  approximate 
the  tail  probabilities,  more  replications  are  needed.  Also,  it  is  obvious  that,  the  larger 
the  number  of  design  points  a design  matrix  has,  the  greater  the  number  of  replications 
tliat  are  required  (e.g.  in  case  of  first-order  designs,  2*  factorial  needs  n = 5).  Tiie 
simpler  the  design,  the  smaller  the  number  uf  replicatiniis  needed  for  closeness  of  the 
approximation. 

A measure  of  data  unbalance  ^ Is  defined  as  (Khuri,  1987], 

— <fli<l 

If  all  the  rv’s  are  same,  or  the  design  Ls  balanced,  then  the  upper  bound  is  attained 
otherwise  the  smaller  the  value  of  0,  the  more  unbalanced  the  design.  Table  3.7 
displays  the  approximate  tail  probabilities  of  ^ as  well  as  those  of  Iteeping  the 
total  number  of  observations  fixed  at  n = 36  in  the  case  of  an  unbalanced  2^  factorial 
design  for  cv  = 0.23  and  for  different  values  of  d.  It  is  seen  from  Table  3.7  that  the 
smaller  the  value  of  the  poorer  is  tiie  approximation,  since  the  value  of  the 
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goodness  of  measure  increases.  So  it  can  be  inferred  that  imbalance  of  the  design 


TABLE  3.7 

Simulated  tail  probabilities  for  uubalaiiced  2^  factorial 


A similar  profsdure  was  coiiductod  for  second-ortler  drsigns,  wliers 


= A + ^ dix,i  + ^ Siixl  + S H Afraid'  + <0 

arid  the  trua  mean  response  is  assumed  to  be 

P„  = Ai  + SAiii  + ^AWa  + ZZAfrdiu’  + ^ABitJ+  XJ 

with  Var(vy)  =i^,  j = l n,  i = l,...m. 

We  have  used  different  CCD's  with  different  axial  points  (o)  at  a distance  of  o 
froin  the  desifpi  cenler  (Kliuri  and  Cornell  1996,  p.  120),  as  well  as  a Box-Behnken 
design  and  a 3^  factorial  in  three  variables  to  cbeclr  the  closeness  of  the  critical  values 
(Tables  3.M.13). 


TABLE  3.8 


Deiiigu  settiuKS  for  central  composite  design 


TABLE  3.9 


SimuJated  tail  probabilities  for  CCD  (c.v.  = .9569) 


TABLE  3.10 


Dwlgn  settiLgs  for  Box-Behukeu  design  with  3 factors 


TABLE  3-11 


Simulated  tail  probabilities  for  Box-Belinkeii  design  (c-v-  ~ .9520) 


TABLE  3.12 


Dcsigu  settiugs  for  3’  factorial  with  center  point 


Coutd. TABLE  3.12 


TABLE  3.1 


Simulated  tail  probnbilities  for  3^  desigus  (c.v.  = -B733) 


As  with  thf  first-orriw  designs,  similar  patterns  appear  in  second-order  designs 
too.  The  following  are  the  goodness  ineasnres  below  and  above  the  median  for  the 
three  second-order  designs. 

Design  c.v.  n,  good,  meas.  below  median  good,  meas.  above  median 
ccd  .9569  4 .569  .002 

Box-Behn  .9520  2 .531  .016 

3’  .9733  0 .594  .011 

Also  the  recommended  number  of  replications  for  all  the  designs  for  which  the 
approximation  is  best  (i.e.  the  goodness  of  measure  is  minimum)  are  displayed  in  the 
following  (from  Tables  3.9,  3.11  and  3.13): 

CCD  -4 
Bca-Behnken  - 2 
3^  factorial  - G 

For  second-order  designs,  it  is  obvious  that,  the  larger  the  number  of  design  points  a 
design  matrix  has,  the  greater  the  niimher  of  replications  required  (e.g.  3^  factorial 
vs.  Box-Behnken).  Tlic  simpler  the  design  is,  the  smaller  the  number  of  replications 
needed  for  closeness  of  the  approximation.  In  the  next  table  the  effect  of  the  location 
of  the  axial  point  a on  CCD  (for  same  c.v  — .9569  and  n<  = 4)  is  displayed. 


TABLE  3.14 


Simulated  teiil  probabilities  for  different  CCD’s 
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For  CCD,  the  choice  of  the  setting  of  axial  point  « does  not  make  much  difference 
to  the  approximation  (from  Table  3.14). 

Next,  we  are  interested  in  testing  Hn  = (I.  We  reject  the  null  hypothesis 

observed  T > • 

The  power  of  the  test  is  approximately  given  by 

Under  any  /i,  we  have  seen  that  T is  distributed  as  a non-central  x*  r.v.  with  non 
centrality  parameter  A.  Therefore,  we  apitroximate  the  power  of  the  test  using 

Fbr  different  values  of  the  confidence  eoeffreient  1 — o*,  and  different  vaiues  of  A.  the 
power  is  calculated  for  both  T (using  simulation  as  before]  and  for  (‘^able 

3.15).  To  check  the  power  of  the  approximated  test,  we  ronsidered  a 2^  factorial 
design  with  cv  = 0.23  and  n,  = 4 replications.  The  same  sets  of  r = kn-pa*  ^ 
were  chosen  to  compare  the  power  using  both  test-statistics  T and  We  reject  Hq 
at  the  .05  significance  level  if 

observed  T > = 46.1943 

Therefore  the  power  of  the  test  is 

Plr  > 

In  Table  3.15  we  compare  the  power  for  tire  above  test  with  that  of 


TABLE  3.15 


Power  for  f tuid  uou-eeutral  for  different  A 


A 

r 

D.275 

13.099 

50.20 

89.678 

35.148 

(0.809) 

1 

(0.9S3) 

1 

(1) 

1 

(1) 

29.376 

0.861 

0.972 

1 

1 

(0.613) 

(0.943) 

(1) 

(1) 

33.381 

0.468 

0.793 

0.998 

1 

(0.413) 

(0.867) 

0) 

(1) 

38.466 

0.197 

0.522 

0.973 

1 

(0.210) 

(0.715) 

(0.999) 

(1) 

46.194 

0.058 

0.253 

0.800 

0.993 

(0.054) 

(0.430) 

(0.994) 

(1) 

53.486 

0.023 

0.130 

0.704 

0.967 

(0.011) 

(0.209) 

(0.973) 

0) 

It  ran  b?  noticed  from  T^ble  3.15  that  x’j,  is  more  powerful  than  the  simulated  T 
except  for  small  values  of  A (A  = .275)  and  lower  values  of  t.  So  we  can  conclude  that 
the  x’-  approximation  in  case  of  rejecting  lack  of  fit  hypothesis,  when  the  alternative 
is  right-twled,  works  well  for  most  of  the  first  and  second-order  designs. 


CHAPTER 


OPTIMIZATION  TECHNIQUES  UNDER  GLM 
il  .liitroductiuti 

In  nspoiisr  surlnrn  nipilioriology,  one  of  tlip  prorediirns  for  dMerminiiig  die  valued 
of  tlie  explanatory  variables  that  optimize  a seejnid-order  response  function  is  ririge 
analysis.  Briefly,  tlie  metliod  of  ridge  analysis  is  described  as  follows.  Let  us  consider 
fitting  a second  order  model  in  * coded  variables  of  the  form 

y — 3o  + x'dt  + x'Bx  + f 
where,  0.  = {3i,-.-.dk)'  and 


/ Ai  /?u/2  ...  3,k/2\ 

o-  : : : : 

WJ*i/2  fl«/2  ...  / 

is  a symmetric  k x k matrix.  Tlie  objective  here  is  to  fiud  the  optimum  (maximum 
or  minimum)  value  of  y over  the  experimental  region  (a  region  of  interest  contained 
in  the  region  in  wliich  the  experiments  can  actually  be  performed)  where, 

V = Sa  + ®'4.  + x'Bx 

and  0,  and  B are  the  least  squares  estimates  of  0,  and  B respectively.  The  optimum 
may  occur  outside  tlie  experimental  region.  In  ridge  analysis,  y is  optimized  subject 
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to  the  constraint  i?  = R’  for  various  values  of  R corresponding  to  hyperspheres 
contained  within  the  experimental  region. 

In  Section  4.2,  the  concept  of  ridge  analysis  is  presented  as  well  as  modified  ridge 
analysis  under  the  usual  assumption  of  normal  errors  with  equal  variances.  We  .shall 
extend  the  modified  ridge  analysis  procedure  to  include  situatious  involving  heteroge- 
nous variances  in  Section  4.3.  .4n  example  along  with  concluding  remarks  are  provided 
in  Section  4.4.  Section  4.5  deals  with  a more  general  problem  in  the  generalized  linear 
model,  fn  Section  4.G  the  use  of  modified  GLM  (Method  II)  in  the  case  of  modified 
ridge  analysis  is  propo.sed.  Finally,  an  example  is  presented  in  Section  4.7  to  demon- 
strate locating  the  optimum  number  of  inedible  kernels  in  tbe  popcorn  experiment. 

4.2  Ridge  Analysis  in  the  Case  of  Normal  Errors  with  Equal  Variances 

The  optimization  is  accomplished  by  using  the  method  of  Lagrange  multipliers. 
Consider  the  function 

F = i - v{x'z  - R}), 

V being  a Lagrangian  multiplier. 

£.0  (4.2.,) 

By  choosing  u larger  (smaller)  than  the  largest  (smallest)  eigenvalue  of  B,  values  of 
zi,...,£t  can  be  found  that  result  lu  maximum  (minimum)  values  of  y. 

Tlie  model  for  the  response  vector  Is 

y = Xa  + e. 

where 


£(e)  = 0 and  Var(e)  =<>’/, 
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and  X = ....  f{x„))'  is  a full  column  ranl<  matrix  of  order  n x p wliere 

p = (1  + 2t  + it(k  - I)/2)  with  X,  = (ii„ . - . and 

f'{xi)  = (l,*i At  any  particular  point  x 

in  the  experimental  region, 

l'ar(v(x))  = oV'(x)(X'X)-‘/(») 

If  the  design  is  rotatable,  then  V‘or(ji(®))  is  a function  of  R.  However,  if  the  de- 
sign is  not  rotatable.  Var(y(a;))  can  vary  appreciably  on  a hypersphere  of  radius  R. 
Khuri  and  Myers  (1979)  proposed  a modified  ridge  analysis  procedure  by  adding  an- 
other constraint  to  the  standard  ridge  analysis.  The  added  constraint  is  developed  as 
follows; 

Suppose  p), . . . , ^ are  the  eigenvalues  of  X'X.  Then 


max  e,  mm  ft 

Let  V = |ci  : : Vp]  be  an  orthogonal  matrix  of  corresponding  eigenvectors  of 

X'X,  then 

X'X  = Vrfi03(ft)V' 


It  follows  that 


If  Wi  = («oiit/H, ..  .,Uh,Vin. •-•lUfcti.  Via, ---.Utin,)',  then 

f{x)'vi  = V^+x’Ti+x'TiX  , 


symmetric 
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The  small  values  of  ft  contribute  signiHcjintlv  to  producing  large  values  of  Vor(y(*)). 
Therefore,  y(x)  can  be  opUmised  under  the  addwl  constraint 


where  uom,  T„  and  T,„  are  the  values  of  no,,  r.  and  T,  respectively,  that  correspond 
to  p„i„,  the  minimum  eigenvalue  of  X‘X.  The  constant  9 is  chosen  as  the  largest 
value  taken  by  |tiD„,  + x't„  + x’T„x\  at  the  design  points.  Let  us  now  consider  the 
function 

— V ~ + x'TfnX  — r)  — A(x'x  — fl*), 

where  v and  A are  Lagrangiau  multipliers,  and  r <q. 


.According  to  Myers  and  Carter  (1973),  values  of  1/  are  chosen  first.  For  a fixed  value 
of  V,  A ia  ciiosen  to  be  larger  (smaller)  than  the  largest  (smallest)  value  0!  B - uT„ 
in  order  to  achieve  a maximum  (minimum)  value  of  y. 


Suppose  we  assume  a second  order  rondel  of  the  form 


where  p is  the  mean  of  a response  variable  y and  are  the  explanatory 

variables.  Suppose  yi,.  ...y,  are  the  independent  response  values  so  that  y = n+e  = 
X0  + € where  E(e)  = 0 and  V'iir(e)  = £ = a^V  with  V being  a diagonal  matrix. 
The  vector  of  predicted  responees  is  given  by 


^ = 0 =>  (B-uT„-  AI)x  = i(r-r„  - 0.) 


(4.2.2) 


M = A + Ail?  + H BiiUXi 


(4.3.1) 


y = x0 


(4.3.2) 


wherp  3 is  ihs  weighlwl  l»asi  squarp  ssi.imatp  of  0.  For  a paiticular  dtsign  pcant,  x. 


fix]  = 


Var(i(*))  = fix){X‘S-'Xr'f{x) 


(4.3.3) 


In  modified  ridge  analysis  (Khuri  and  Myers,  1979}  w«  optimize  i subject  to  the 
ronstraint  x'x  = /?^  as  well  as  tile  ronsiraint  on  |nrM  + xV„  + x'TmX|. 

For  the  above  situation  and  assuming  S to  be  known,  we  need  a slight  mod- 
ification of  the  modified  ridge  analysis.  Suppose,  m, ..  ,0,  are  the  oigeiivalucs  of 
X'E"'X.Then  we  have. 


Let  us  write  X'S"'X  = 5D(a,].S' where  S is  an  orthogonal  matrix  of  orthogonal 
eigenvectors  of  X'H~'X,  then 


where  Sj  is  the  ith  rx>lumn  of  S ruid  ir<  of  the  form  ar  = (sq,,sh,  . ...ssj,sn„.  ...ste,, 
Then, 


/'(»)/(») 


Vavivix))  = 1 


/'(x)*i  = So, -i-xVi + ®'SjX  and 


where,  V-i  = («i„ . . - , s*,)'  and 
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Using  thp  SAmr  Argument  as  in  moctiiied  ridge  lumlyiiis,  we  restrun  the  portion  of 
l'or(ji(x))  which  corrcaponds  to  n™,„  the  Binallest  eigenvaiue  of  X'H~'X,  so  that 
the  variance  of  the  predicted  response  can  be  reduced  within  the  r^iuii  explored  by 
ridge  analysis.  So  we  can  use  the  Lagrange  multiplier  method  ou  the  function 
/ = i-  u[st„  + x'rti„  + x‘S„x-r)-  X{x'x-  R^) 
where  sg„,  tl>n>  S„  are  the  values  of  sg,i  'I'l  and  S,  rnrrespunding  to  a„,„  and  r<q, 
q being  the  maximum  value  of  |so,n  4-  3^>bm  + over  all  design  paints.  Solving 

^ = 0 we  obtain 

(B-vS„-A7)x  = (4.3.4) 

Enuation  (4.3.4)  is  solved  using  Myers  and  Carter's  (1973)  method  by  fixing  u and 
choosing  A larger  (smaller)  than  Che  largest  (smallest)  eigenvalue  of  B - for 
achieving  a mnximiim  (minimum).  The  optimum  response  y is  obtained  for  different 
values  of  r and  H. 

In  practical  situations,  S Is  unknown.  We  use  an  Ultimate  of  S instead.  Assuming 
£ to  be  diagonal  and  that  ii,  > I replications  are  available  at  the  ith  design  point, 
we  have  £ = diag(e^, . . . ,o4)  where 


Pi,  is  the  jth  observation  for  the  ith  design  point  j = I,...,n,,  i = and  y,  is 

the  mean  of  the  n,  observations.  Let  ii  and  lij  be  the  estimates  of  Sj  and  n,  discussed 
previously  using  £.  Then,  the  function  we  are  interested  in  optimising  is 

/■  = P - )i(S[lm +*'V’m “ r)- A(l'x  - ) 

Solving  ^ = 0 results  in 

(B  - uS„  — Xl)x  = ~ 
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Fnr  6xed  v.  A is  chosen  lArgCf  (smaller)  than  the  largest  ^smallest)  eigenvalue  of 
(B  - uS„t)  for  a maximum  (minimum)  of  y. 

4.4  Example 

As  an  example,  the  design  front  Khuri  and  Myers  (1979)  is  considered  for  a three- 
variable  model  (Table  4.1).  Six  sets  of  replicated  observations  for  15  design  points 
were  generated  from  itormal  distributions  witii  means  Pi  s f'{Xi)0  and  variance  <r^ 
1 = 1,  ...15,  for  some  preassumed  values  of  0 and 

For  Che  first  part  of  the  example,  we  ignore  the  notion  that  the  observations  are 
heteroscedastic.  We  want  to  maxiitiue  y in  this  problem.  So  we  perform  the  ridge 
analysis  assuming  that  Vnr(y)  = 17^/.  We  calculate  the  eatimales 

0 = {X-X)-'X-y 
- X(X-X)-’X-]y/(T,-p) 

with  n = 90  and  p s 10.  The  modified  ridge  analysis  is  applied  next.  The  minimum 
eigenvalue  for  X'X  is  .1926.  For  all  the  design  points,  the  values  of  loo,,  + »'’'ra  + 
z'T„z|  iwere  calculated,  where  the  largest  value  was  .0669.  So,  y Is  maximized  under 
the  constraints  R < 2.1876  and  r < .0869.  For  dilfereot  values  of  R,  j and  the 
estimated  values  of  y for  several  values  of  r are  displayed  in  Table  4.2.  Some  values 
of  r higher  than  .0869  are  also  considered. 


TABLE  4.1 


Desigu  and  the  observations 


TABLE  4.2 


Results  of  modified  ridge  analysis 
iguoriug  heteroscedasticity 


R 

I.8C 

1.41 

1.21 

1.02 

0.89 

0.72 

0.64 

r 

1.75 

0.89 

0.58 

0.34 

0.20 

0.05 

0.005 

I, 

0.54 

0.43 

0.37 

0.33 

0.29 

0.24 

0.23 

1.09 

0.92 

0.83 

0.75 

0-68 

0.S8 

0.53 

1.4 

0.98 

0.79 

0.62 

0.50 

0.34 

0.27 

maxy 

23.88 

16.01 

12.99 

10.54 

8.99 

7.13 

6.38 

K^r{y) 

426.52 

18.22 

55.37 

24.41 

12.99 

6.22 

5.24 
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In  the  next  part  of  the  example,  we  eoiieidered  the  six  replications  of  the  obser- 
vations in  order  to  estimate  f ==  1, 15.  Once  they  are  estimated,  S is  obtained. 
Tlie  minimum  mgenvalue  of  (X'^~^X)  is  .0129.  Again,  for  all  the  design  points 
valuas  of  |Iim,  + x'^„  + z'SmZ^I  were  obtained  to  gel  an  idea  of  q.  The  largest 
value  In  this  case  is  5 = .1176.  So,  modified  ridge  analysis  was  performed  under  the 
constraints  R < 2.1876  and  r < .1176.  Table  4.3  reveals  maximnm  values  of  ji  and 
V'ar(p)  for  different  combinations  uf  Rand  r . By  comparing  Tables  4.2  and  4.3  it  can 
be  noticed  that,  there  is  a tradenif  between  high  responses  and  quite  .small  Var(p). 
Under  the  homogeneity  assumption,  from  Table  4.2,  for  a radius  R = 1.86,  a response 
y = 23.88  has  VaT{y)  = 426.52.  A similar  set  (ba.sed  cm  the  values  of  the  radius  and 
i)  ill  the  second  case  (assuming  heterogeneity)  gives  for  R = 1.9936,  a p = 25.76  with 
V'ar(p)  = 20.16.  So,  it  is  very  mudi  advisable  to  check  first  for  possible  variablity  of 
the  responses  and  if  detected  the  modification  of  the  modified  ridge  analysis  should 
be  used. 


TABLE  4.3 


Results  of  modified  ridge  analysis 
under  heteroscedasticity 


From  Che  tables  we  note  a substantial  amount  of  reduction  in  the  prediction  vari- 
ances in  the  case  where  the  variances  are  assumed  to  be  heterogenous.  So,  it  is 
aiways  advisable,  before  undertaking  any  optimiaation  problem,  to  first  test  for  the 
homogeneity  of  variances  and  then  apply  the  modified  ridge  analysis  accordingly. 

One  drawback  of  the  modified  ridge  method  wlien  there  are  unequal  variances  is 
that  E is  random  and  so  are  S„  and  To  take  this  randomness  into  account, 
confidence  interval  for  Che  largest  (smallest)  eigenvalue  of  S - i/S„  is  proposed  in 
the  following  Section  as  dweribed  in  Carter,  Cliinrhilli,  Myers  and  Campbell  (1986). 
In  the  next  Section,  we  shall  extend  the  niodified  ridge  method  while  accounting  for 
the  randomness  of  the  estimates  In  case  of  generaiized  linear  models. 


4.5  ModlHal  Ridge  Aiialvsia  under  GLM 


In  this  5«lion,  implemi-ntation  of  modified  ridge  analysis  under  the  fraiiiework  of 
the  generalised  linear  model  (GLM)  Is  disnissed.  Under  GLM,  the  response  y belongs 
to  an  exponential  family  with  a probability  distribution  f{y;  9).  The  linear  predictor 
1)  explains  the  empirical  relationship  between  the  mean  p of  p and  the  explanatory 
variables  ij,. ..  ,ix.  Thus,  if  i)  = f'{x)0,  llieii  ij  is  related  to  p by  a link  function 
rj  = p(p)i  where  g{-)  is  a strictly  monotone  and  differentiable  function.  Suppose  that 
it  is  desired  to  uiaximize  (or  minimize)  the  mean  response  p.  Since  s(-)  is  strictly 
monotone,  this  is  equivalent  to  inaximixing  (or  minimizing)  q.  Let  us  assume  that  a 
second-order  model  is  fitted  to  q i.e. 

q = nx)0 

= 00  + x'0.  +a:'£lx 

The  method  of  modified  ridge  analysts  is  applied  to  q.  Let  us  define  the  random 
variable 

2|  = ff(/ri)  + (B.-/r.)^l„=e. 

that  is,  2,  is  a first-order  ^proximation  of  g{y,)  in  a neighbourhood  of  p,.  Now, 
E(z.)  = q,  and  from  Sertion  (2.2). 


Let  IV  = diafl(tii ,u’„).  Then  by  weighted  iterated  least  squares  we  have  the 

following; 


0={X'w  'xy'x'w' 
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where  0 nntl  W are  the  eatimaies  of  0 and  W at  the  m-tli  stage  of  the  iterative 
procedure  (Equation  2.2.2).  The  varianre-rovanance  matrix  of  ^ is  approximately 
given  by  (2-2-3) 

V'nr(fl)~(X'lV''x)-' 

V'nr(^(®))  = Var{f{x)0) 

= f{x){X'W-'x)-'f{x) 

We  now  use  the  principle  of  modihed  ridge  analysis  to  determine  the  optimum  of  i). 
Suppose  that  the  uiinimuin  eigenvalue  and  eorreaponding  eigenvector  of  X'W~'X 
are(™„  and  d„,  = (do™ iiim,  • - • We  define,  ui„  = (i5i, dim)' 


diiv.  dij,«/2  - d.,„/2 

-symmetrir  Sttm 

The  values  of  and  will  be  estimated  by  and  d,„,  respectively,  which  are 
the  minimum  eigenvalue  and  corresponding  eigenvector  of  X'W  'x.  Similarly,  uf.„ 
and  12ni  are  estimates  of  tOm  and  flm- 
The  objective  function  in  this  case  is 

Fa  = f)  - + *V„  + x'n„®  - r)  - A(a'x  - T0] 


^ = 0 =e  (B-en^-AI)®  = |(vur-„-4.)  (4.5.!) 

With  modified  ridge  analysis,  we  clioose  A to  be  larger  (smaller)  than  the  largest 
(smallast)  eigenvalue  of  B - for  fixed  v to  achieve  the  maximum  (minimum) 


value  uf  i).  Now,  siiiw  ui'„,  and  lilm  are  random,  so  u:  the  value  of  A.  Let  us  denote  the 
largest  (smallest)  eigenvalue  of  S - uQ„  by  i and  Its  estimate,  the  largest  (smallest) 
eigenvalue  of  B - by 

LEMMA  4.1:  The  protiabUity  tiiat  A is  larger  than  the  largest  true  eigenvalue  of 
B — vQ„  is  iucreased  if  A is  chosen  to  be  larger  than  the  upper  limit  of  the  symmetric 
confidence  interval  of  the  largest  eigen  value  of  B - i/{l„  than  choosing  a A greater 
thau  the  largest  (smallest)  eigenvalue  of  B - 

Suppose  tfo  is  the  true  value  of  the  largest  eigenvalue  of  B - u[l„.  i is  its  point 
estimate.  Let  be  the  lower  and  upper  limits  of  a confidence  interval  on  <h, 

respectively.  Note  that 

*{i)  < ^ < rt(j) 

since  i - o(l  - a/2)s.e{i)  and  am  = i + a(a/2)s.e(i4)  where  a(k)  is  tlie  *th 
percentile  for  the  relevant  distribution  of  Consider  the  events 


S = |A  > ^1 
C = (A  > 01 

Then,  according  to  the  statement  of  tiie  thforem,  we  need  to  show  that  P{B\A)  > 
P(B\C).  Now, 


‘ ' P(A>0) 


Now,  4 can  be  either  underestiiimted  or  overestimateH,  i.e,, 

0 < Aj  or  0 > 00 

Since  0 is  always  less  than  0(2).  in  the  case  of  0>  0o  we  will  have 
00  < 0 < 0(2) 

and  in  the  case  of  0 < 0o,  we  have  one  of  the  two  poseibilities 

0 < 0(j)  < 00 
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0 < 00  < 0(2) 

Thus  three  possible  situations  exist  : 

(1)  0O<0<0(2) 

(2)  0 < 0(2)  < 00 

(3)  0 < 00  < 0(2) 

In  ease  (1).  P{B\A)  = 1,  P{B\C)  = 1.  In  case  (2).  P{B\A)  = P{B\C)  = 

However,  A > 0,2)  ^ A > 0-  So,  /’(A  > 0(2))  < P(A  > 0)  , i.e„  P(S|/1)  > 
P{8\C).  In  case  (3),  P(S|>1)  = 1 and  P{B\C)  = So.  in  all  the  three  cases, 

P{B\A)  > P(B\C). 

Let  us  now  find  a confidence  interval  for  the  largest  eigenvalue  of  B - i/Sl„  for  any 
fixed  w.  This  is  done  as  follows:  Suppose  C is  an  approximate  100(1  - confidence 
re^on  for  0,  the  vertor  of  all  parameters  in  the  linear  predictor.  Then  a 100(1  - q)96 
conservative  confidence  interval  for  any  continuous  function  l(,8)  (as  discussed  in  Rao 
(1973),  and  Spjotvoll  (1972))  is  approximately  given  by 

P i..f  f(l9)<I03)<supl(/S)]>l-« 

0^c  J 
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The  matrix  B.  which  is  of  order  i x Ir,  consists  of  elemenla  of  0.  Aiso,  n„  being  a 
matrix  associated  with  the  eigenvector  of  the  largest  eigenvalue  of  (X'W'X)  is  a 
function  of  U'r  (I'th  diagonni  element  of  IV),  which  is  a function  of  jj„  and  therefore 
is  a function  of  0.  So,  we  conclude  that  d.  the  largest  eigenvalue  of  B - for  a 
fixed  rr,  is  a function  of  0.  We  can  therefore  find  a confidence  interval  of  this  largest 
eigenvalue  by  using  the  following  method  as  desc  ribed  in  Carter,  Chinchilll,  Myers 
and  Campbell  (1986): 

For  large  samples,  0 is  asymptotically  normally  distributed  with  mean  0 and 
esiiiiiated  variance-rovariance  matrix  (X'lV  X)”*),  (Agresti,  1990,  p.  451)  So 
from  Wald’s  (1943)  general  a-symptoiic  results  for  ML  estimators  in  large  samples, 


where  p=  1 +24  + fc(Jr  - l)/2  the  number  of  parameters. 

We  can  use  this  result  to  find  an  approximate  100(1  - a)%  confidence  region  C 
for  0,  namely, 


{0-0y{VaH0))-'{0-0) 


= {0~0Y{X-W''x){0-0)iS4 


P[{0  - 0)'{X'W-'xn0  -0)<  = 1 - rv 


Let  us  write  X'lV  X = PAP’  wliere  A is  the  diagonal  matrix  with  eigenvalues  of 
X IV  X and  P is  an  orthogonal  matrix.  So, 


[0-0nX'W-'X){0-0) 

^{0-0ypAP'{0-0) 


/3  = j3-PA-i. 


Then,  P[w'u  < 1 — n where  i?  = \\ 

Thus  Che  conHdeiiee  region  nbnul  0 hoe  been  transformed  to  a multidimensional 
spherical  region  of  diameter  p*  = X»i-n-  Anderson  (1984,  p.279)  used  a transforma- 
tion of  u from  rectanguiar  to  poiar  ro-ordinatcs  to  find  points  in  the  confidence  recoil 
C.  So  any  number  of  points  in  C can  be  obtained  by  ronsidering  different  values  of 
p and  6,  whicli  is  the  angle  of  transformation,  over  some  specified  ranges. 

Thus  any  point  wiciiin  the  confidence  region  can  be  represented  by  using  the 
foiiowing  spherical  coordinates  (Carter,  Chiiicirilii,  Myers  and  Campbell,  1936): 

Us  = pcosdi  sin6% 


Of  = p cos  6i . . . cos  8^1  cos 


where,  t(|,  are  the  elements  of  u and 
0 < 

< ft  < f 

< 9p-l  < !T 


>•  = 1 (P-2) 


A conservative  100(1  - a)%  confidence  interval  for  any  scalar  function  l{0)  of  0 is 
then  given  by 

nif  f(i9),  sup  K/S)  (4.5.2) 

[0ec  J 

In  our  problem,  we  consider  l{0)  to  be  the  largest  eigenvalue  of  B - For  a 
fixed  value  of  u,  after  choosing  the  vaiiie  of  A to  be  larger  than  the  upper  limit  of  the 


confidt^nL'P  imerval  on  the  largest  pigravalue  of  B - en„,  we  solve  for  x In  equation 
(4.B.7).  We  repeat  this  proi'Mlure  for  different  values  of  f and  obtain  x in  each  case. 
Under  constraint  on  |i||„  + x’wm  + optimum  of  ij  and  consequently  that 

of  pt  are  found.  Also,  we  can  find  (onfidence  intervals  for  r;(x)  s as  well 

as  p(x)  = p(r}(x)]  at  the  location  of  the  optimum,  both  betng  functions  of  The 
conservative  100(1  - a)%  ronfidfiire  interval  for  ij(x)  Is  approximatfly  given  by 


and  the  conservative  approximate  100(1  - a)iit  confidence  interval  of  p(x)  is 


In  Chapter  Two  we  have  seen  that  modifying  standard  GLM  for  log-link  and  logit- 
link,  using  seennd-order  approximation  resulLs  in  smaller  variances  of  the  parameter 
estimates  as  well  as  the  prediction  variances.  In  this  section, we  use  this  mudification  ill 
conjunction  with  modified  ridge  analysis  for  optimizing  the  response  variable  under 
nonnormality  or  in  cases  where  the  response  variances  are  lieterogenous.  We  will 
concentrate  on  Method  II  in  this  Section.  Results  of  Method  i can  be  similarly 
implemented. 

Suppose  Pi,.  ...i/n  are  independently  distributed  according  to  exponential  family 
distribution  with  E{y]  = n and  l'nr(y)  = U = diag{a^,.  ...oj).  Using  the  second 
order  approximation  in  Metliod  II  of  a = g(v).  we  have  E{z)  = ^(.Y/3)  = ^(tj) 
(from  2.4.2)  and  Var(2)  = W = dinj(iu;,  ...,ui;).  In  the  standard  GLM,  we 
have  already  seen  Iiow  to  optimize  tlie  estimated  mean  response  p using  modified 
ridge  anaiysLs.  Here,  we  will  modify  the  procedure  in  the  extended  GLM  case  using 


(4.5.3) 


(4.5.4) 


difffrfm  estimstre  and  variaiires,  Thp  objective  is  to  optimize  #1  or  equivalently  jj 

subject  to  tile  conditions 

(1 1 ®'x  < FP  and 

(2)  |Ao-.  + *W  + a:'n„x|<, 

The  difference  between  tlie  modified  ridge  analysis  under  standard  GLM  and  this 
procedure  is  in  the  estimate  nf  S (Section  2.4,  Equation  2.4.3): 

0 = =0,„  + - *(.V3„)| 

wliicli  is  estimated  by, 

Var(d) 

F(f9)  = ^,  f'”"  = F(/3„) 

and  W''*'"’  is  the  estimated  value  of  IV’  after  the  mth  iteration.  Therefore,  for 
^(»)  = 

Vnr(i7(i))  :r 

Suppose  the  minimum  eigenvalue  and  tiie  corresponding  eigenvector  for 
(FWien-'F)  are  and  a„  = (oo, lOtrn.ttiim,  ..-,a(e-i)icm)'  and  thinr  cor- 

responding estimates  are  and  d„.  The  objective  function  in  this  case  is 
fj  = ij  - ir(iito,  + -e  z'r,„*  - r)  - A(x'x  - tf) 
where  7)„  is  the  estimate  of  7„  = (r»i„ ,otni)‘. 


t«llm 


is  estimated  bv  Tm  and  t <q. 


dx 


(4.6.1) 


From  this  point  on  the  procedure  is  the  same  as  modided  ridge  analysis.  For 
clioosing  the  value  of  A for  fixed  value  oft',  <ve  can  follow  the  same  proredure  discussed 
in  the  previous  section.  For  a maximum  (minimum),  A can  be  chosen  larger  (smaller) 
than  the  upper  limit  (lower  limit]  of  the  confidence  interval  of  largest  (smallest) 
eigenvalue  of  (B  - er„).  Tlien  for  different  values  of  u,  different  upper  (lower)  limits 
for  the  largest  (smallest)  eigenvalue  were  found.  After  solving  for  x using  (4.7.1) 
values  of  |6om  + x'7m  + xT'mX|,  optimum  of  i),  and  consequently  that  of  p are  found- 


As  an  example,  again  vre  consider  the  Popcorn  Experiment  discussed  in  Chapter 
two.  Here  we  try  to  6nd  those  combinations  of  the  three  explanatory  varialrles 
(burner  setting,  auiount  of  vegetable  oil  and  popping  time)  for  which  the  number 
of  inedible  hernels  can  be  minimixed.  Here  p=10  and  for  a 95%  confidence  region, 
Aia.sa  = 18.31.  We  first  find  the  lower  limit  uf  the  confidence  interval  for  the  smallest 
eigenvalue  of  (B  - leF^)  with  confidence  roeffirient  .95,  for  some  fixed  values  of  u. 
Ridge  analysis  is  performed  under  the  constraints  R < 1.732  and  r < .9768,  where 
q = .9768  is  the  largest  value  of  |5s„  + z'uim  + x'n„xl  calculated  over  ail  design 
points.  For  the  combinations  of  u and  A (whicli  are  cliosen  to  be  smaller  than  the 
lower  limit  of  the  confidence  interval  of  4>)  the  valuH!  of  x are  solved  using  equation 


4.7  Exaimile 
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(4.G.7)  and  then  R.  ji(x)  ftiid  V'/jr(^(®))  are  calculated  (Table  4.4).  Table  4-5  gives 
95%  conservative  confidence  intervals  for  the  mean  number  of  inedible  kernels  »i(x) 
for  all  the  design  points  and  the  observed  numirers  of  inedible  kernels. 

TABLE  4.4 

Results  of  modified  ridge  aualysis 
under  geueralieod  liuoar  models  for  the 
popcorn  example 


R 


1.94  1.71  1.39  0.95  0.62 


1.23  0.74  (1.17  0.43  0-74 


-0.25  -0.20  -0-13  -0-04  0.02 


1.43  1.26  1.04  0.72  0.47 


1.29  1.14  0.92  0.C2  0.41 


8.67  9.99  11.86  14.67  16.81 


10.3!  8.26  .5.00  3.56  3.89 


0.45 


0.85 


TABLE  4-£ 


ConBcrvative  95%  conSdeuce  intervals  for  )i[x]  -the  mean  number 
of  inediblc.kpruels  in  the 
popcorn  example 


If  WD  compsrp  tlif  valups  of  fi  at  R = 1.94  and  at  R=  .62,  we  notice  a substantial 
amount  of  increase  in  ji.  As  the  predicted  response  increases,  the  prediction  variance 
decreases.  Therefore,  to  minimiee  the  number  of  inedible  kernels,  we  have  to  compro- 
mise between  increasing  fi  and  increasing  V'ar(^).  The  best  combination  from  this 
table  would  be  (k  = 9.99  at  ij  = -.20.  ij  = 1.26  and  i%  = 1.14  for  fl  = 1.71,  r = .74 
and  Vur(p)  = 8.26.  Also,  the  conservative  conhdence  intervals  of  the  predicted  mean 
number  of  inedible  kernels,  at  each  of  the  15  design  points  were  presented.  The  remits 
show  that  the  observed  itumber  of  iitedible  kernels  are  well  within  the  95%  conhdence 
interval  for  /i(x)  except  those  at  die  center  point. 


CHAPTER 


OPTIMAL  DESIGNS  FOR  GENERALIZED  LINEAR  MODELS 
5,J_lDtrQductimi 

Lpt  • • • , r&nHom  vAriablfS  with  iiiPAns  t^2{x) Pn(x)  and  the 

linear  predictor  is  defined  as 

7|,(l)  = S(#i,(®)) 

Suppose  rj  = (i)i(*), , - . We  fit  tlie  model 

z = Xe  + e (5.1.1) 

where  X is  a function  of  the  design  matrix,  S = (9|,, . . , 9p)'  is  the  vector  of  unknown 
parameters  and  E{z)  = t)  = XB.  Let  W(*)  = IV  = diag(tci,..  .,iir„)  where 

IS,  = V'ar(p,)(^)’ 

an. 

The  weighted  least  squares  estimate  and  its  variance  are  given  by 


and  Vor(0)  ~ (X'lV-'X)-' 

Now.  te,  depends  on  p,  which  in  turn  depends  on  9.  So  to  find  optimal  designs  we 
need  to  use  either  sniiieminl  methods  or  Bavesian  techniques.  In  Section  5.2  four  dif- 
ferent design  opl.imality  criteria,  used  mostly  in  a GLM  tau'ext,  are  discussed.  Tlie 
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conditions  or  stopping  rules  requirwl  for  geiiernling  differpnt  optimal  designs  for  the 
four  criteria  are  derived  in  Section  5-3.  In  Sections  5.4  and  5.5  we  propose  sequential 
and  Bayes-sequential  methods  using  the  stopping  rules  ubtaineil  in  Section  5.3.  Sec- 
tion 5.G  deals  with  the  frequently  used  logistic  regression  model  and  the  expressions 
for  the  four  optimality  criteria  for  both  first-order  and  second  order  models.  The  two 
proposed  sequential  methods  are  compared  for  the  lopstir  model  in  Section  5.7.  An 
exanqile  and  concluding  remarks  are  given  in  Section  5^. 


Let  ^(s)  be  the  estimated  value  of  tiie  linear  predictor  at  X.  Thus  i7(x)  = (f(x)6. 
where  ^'(x)  is  of  the  same  form  as  a row  of  X,  but  is  evaluated  at  X. 

(t)  Q-oplimality  criterion 

Myers,  Myers,  and  Carter(l994)  defined  Q-optimal  designs  on  the  Insis  of  the  av- 
err^e  prediction  variance.  Suppose  that  the  joint  distribulinn  ofpi, . ..  ,p„  is  denoted 
by  /(l(|0iX).  For  a design  measure  { on  H (Khuri  and  Cornell.  1D9G,  p.  433-434), 
wliere  R is  the  experimental  region  of  interest,  M(9,?)  denotes  tlie  information  ma- 
trix wliose  (i,i)th  element  is  defined  as 

M(s,  «)„  = - iog(/(aie,  »))]?(<i®) 

Tile  average  prediction  variatite  criterion  is  given  by, 
jRVar{T,{x))dx 

Ir'^ 

where  l'oc(^(x))  = C(x)Voc(e)C(x),  Vnr(S)  ~ (X'W-’X)-‘  = 

(Agresti,  1990,  p.449).  A Q-optimal  design  is  a design  that  ininimbes  the  APV. 


Let  us  define, 


(5.2.1) 


/Rrfx 

Thus  a Q-optimal  design  is  the  oi>e  that,  maximizes  <3({.d). 

(It)  01  and  02  optimality  eriteria  (Ctialoner  (uid  Larnta  (1989),  p 193) 

The  first  criterion  coiiBidercd  by  Clialonet  and  Larntz(1989)  was  to  rhoose  the 
design  measure  { maximizing  0i((,  9)  where 

0,(«,e)  = log|deiM(9,«)l 
The  serond  eriterion  reiinires  maximization  of 

02(«,e)  = -lr|B(9)Af(0,?)-‘] 

where  B(9)  is  a.symnietricpxp  matrix.  The0i  criterion  is  ec|uivaleiit  to  D-optimality 
ill  linear  models  while  the  02  criteroii  is  equivalent  to  A-optiniality. 

(Ill)  Ae  optimality  criterion  (Jones  and  Mitchell,  1978) 

Here,  we  suppose  that  the  true  mean  of  * is 

r],  = XB  + X.e.  (5.2.2) 

We  aasume  tliat  the  ubaerved  v,*’s  are  independently  distributed  with  expectation 
given  by  (5.2.2)  and  variance  ui,. 

The  model  fitted  initially  is  (5.1.1),  with  the  belief  tliat  tills  model  will  adequately 
approximate  the  true  response  over  the  reginu  of  interest.  If  this  is  incorrect  and 
(5.2.2)  is  the  correct  model,  the  residuals,  when  plotted,  sliould  deviate  from  suggest- 
ing a significant  lack  of  fit.  We  shall  consider  designs  that  increase  the  sensitivity  of 
the  lack  of  fit  test  in  detecting  inadequacy  of  the  assumed  or  fitted  model. 

The  fitted  model  is 


17  = xa 


(5.2.3) 


where  the  estiniate  of  9 


e = (X'W  'X)-'X'W  'z 


The  preriiction  biae  at  a point  x is  given  by 

£W(»)I -’?!(*) 

= £(C'(x)e)-C'(*)0-Cl(*)S. 

wiiere  C.(*)  has  the  same  form  ns  a row  of  X.,  hut  is  evaluated  al  x.  Thus, 

('(x)(x'w-^x)-'x-w-’(xe  + x.e.)  ~ c'(®)«  - Cl(*)9. 
= (C'(X)(X'1V->X)-’XW-'X.  - c:(x)ie. 


It  foilows  that 


|X(X'W-'X)-'X'IV-'X.  - X.]9. 


The  optimality  criterion  we  propose  here  is  a modification  of  Jones  and  Mitcheil's 
(1978,  p.  541)  criteria  whicii  involves  generalising  the  ordinary  least  squares  esticnat- 
ing  method  to  tliat  of  weigiited  ieast  squares.  It  invoives  the  maximisation  of 

g'^ff'W-'Ke. 

= e'^ie. 


L = X({W-'  - W'-'X(X'W-'X)-’X'H'-')X. 


(5.2.4) 


Now  ffupposo  we  define  T ~ 


I'll  = i'ii>  = 

I'oo  = lf^Wn'-'C\=c)Hx  I/O,  = jj^<{x)W-\\(x)dx 

The  optimality  rnterioii  we  ronaicier  licre  is  the  maximization  of  Ae  over  idl  designs 


UO'.LO.dF 

I^dF 


(6.2,5) 


and  rff  is  tiie  differential  surface  area  of  tlie  ellipsoid  6o  = |9.  : 8[TB,  = S\  for  some 
eonstant  6 > 0. 


5.3  Generatiiie  Optimal  Degiens 

Let  II  be  the  set  of  all  design  measures  on  It  and  let  Af  (9,^)  be  the  information 
matrix  for  ^ £ /f.  A design  measure  is  railed  ^optimal  if  it  maximizes  a concave 
function  *(M(0,?),9)  over  II  (Silvey,  1080,  p.l5).  The  set  M = [M(0,?) : ? € ff] 
is  convex.  For  simplicity,  let  us  denote  M(6,  ()  by  M from  now  on.  For  Mi  € M 
and  M-i  6 M,  the  FYediet  derivative  of  ^ at  Mi  in  the  direction  of  M-i  is  (Silvey, 
1080,  p.l8): 

F,(M,.  M,)  =^lim  (l/e)(*((l  - r)M,  + eM,,0)  - 6(M„9)]  (5.3.1) 

Define  to  be  a design  measure  in  R,  such  that 


(*  = 


if  a:  is  a design 


lei  Mx  be  the  corresponding  information  matrix.  Then  the  Frediet  derivative  of  ^ 
at  M in  the  direction  of  Mx  is  denotetl  by  (Silvey,  I960,  p.l8) 

d{(,x)  = Ff(M.Mx) 

THEOKCM  5,1:  If  ^ is  concave,  then  a ^optimal  d^ign  ^ can  be  characterized 
by  one  of  the  following  conditions 

(i)  & maximizes  d{M.6). 

(ii)  & mininiizas  supj.^jjd(^,a:), 

(in)  6np„,flrf(&,»)  = 0. 

Whittle  (1973)  proved  this  theorem  for  linear  modeis,  and  it  was  also  proven  later 
in  case  of  non-linear  regression  problems  under  certain  assumptions  (Chaloner  and 
Larntz,  1989,  p.194). 

Let  us  now  consider  the  different  optimality  criteria  that  were  discussed  in  Section 
(5.2)  and  obtain  the  expressions  for  snpj.jjj<i({,  *).  Rewriting  the  criteria  in  terins 

J-ftda 

«i(M,fl)  = log(detM), 

*(M,fl)  = -tr(B(6)M-') 

Az(M, e)  = 

_ IbJ’-W».){Mu  -M,oM„VMo,)]dF 


M = 
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( Moo  Moi 

M,o  Mu 

M„  = x',w-'x..  Mio  = x;h'-'x. 

Moi  = X'W’-’X.,  Mo,,  = X'lV-'X  and  A{6.)  = 6.6',. 

The  necessary  Frechet  derivatives  of  ft>i{M,6)  and  i^{M,6)  are  (Ciialoner  and 
Larnts  (1989,  p.l95)) 

(i)for(h(M,ff) 

F,,(M,,Mj)  = ir(M,M7')-p  (5.3.2) 


where  p is  the  number  of  parameters  In  8 
(li)  and  for  MM,6] 

M,)  = i.r(B(ff)Mr’MjMr'l  + (5.3.3) 

Next  we  derive  the  Frecliet  derivative  for  Q(M,S).  We  can  rewrite  Q{M,6)  as 

/«<*» 

= -r-> 

where  lnHx  = r.  Then, 

Q(M,  6)  = -r-'  triC(x)M'']rf*, 

where  C{x)  = f(*)C'(*).  The  expression  inside  the  integral  of  Q{M,B)  has  the 
same  form  as  *i(M,  6).  Thus  the  Frediet  derivative  for  Q-optimaJity  is 

Fo(M,.M,)  = r-'y^l.r[C(*)Mr'M,Mr'|dx  + Q(M|,fl) 

= r-‘/^trK'(x)M7’M2Mr’C(*)|rfx  + g(M„9) 

= r-'|^(<‘(*)Mr'M.,M7'C(*)]d*  + g(M,,0)  (5.3.4) 


1(10 


Let  UB  now  write  As(Af,e)os 

A.(M, , 0)  = 

A,(M,.0)  = Je. 

where  and  are  the  partitioned  inatricea  aaaociated  with  My  and  M2  re- 
apeciively,  i,  j = 0, 1 , Then  from  the  definition  of  FVediel  derivative 

= _lim^(l/e)[A,((l  - .)M,  + rMj,  0)  - Aj(M,,  0)j 
= Um(l/e)(A,(M,0)-A,(M,,0)| 

. |„„  II, .lit-  J 

•-•e*  /bj  iIF 

where  Mjj  = (1  - e)Af?.  + eM’.  for  t = 0,l  and  j = 0,l. 


Let  0Q  he  the  initial  value  of  0 and  i^{M,  0)  be  the  optimality  criterion  we  would 
like  to  either  minimize  (or  maximize).  Let  ub  auppose  that  ^{M,  6)  is  to  be  nutxi- 

1 . In  the  first  step  we  start  with  an  initial  discrete  desijpi  Dt,  and  observe  the  data 
Pq  using  this  design. 

2.  We  obtain  a design  point  i,  at  whirl]  siipj.^fld(Dii,  x)  is  attained. 

3.  A new  discrete  design  Dy  is  obtained  hy  augmenting  Da  with  x,.  Let  v,  denote 
the  data  vector  obtained  under  Dy. 
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4.  A Brsl-ordpr  approximation  of  5(1/1)  = *1  oriri  W^j,  the  initial  estimate  of  IV, 
which  is  a function  of  9 arc  calculated  using  the  initial  value  do 

5.  A new  estimate  of  6 is  obtained  at  this  point  using  the  formula 

01  = 

where  Xi  is  the  X matrix  for  the  linear  predictor  corresponding  to  the  design  D\. 
6-  Using  9i  ill  the  next  step,  one  more  design  point  xi  is  obtained  at  which 
d(£>i,a:)  is  attained.  Let  the  augmented  design  and  the  data  be  denoted  by  Di  and 
£/],  raspectively. 

7.  Next,  Zo  = s{yi)  as  well  as  are  computed.  Then  the  estimate  of  0 can  be 
updated  by  the  formula 

0;  = 

where  Xj  Is  the  X matrix  for  the  linear  predictor  corresponding  to  the  design  £>;. 
Since  from  part  (iii)  of  Theorem  5.1,  the  goal  is  to  mahe  sup^.j2d(.D,  x)  — t 0,  we 
continue  the  above  procedure  until  at  the  m-th  stage 
sup  d(D„,*)  <S 
where  d is  a small  positive  number. 


A prior  distribution  on  0 is  assumed  instead  of  an  initial  value.  I<«t  the  assumed 
prior  be  denoted  by  l{9]  and  the  initial  de.sign  by  Da- 

1.  First,  we  generate  9 from  the  prior  distriliution  and  use  tlie  initial  design  to  obtain 
the  response  vector  y^.  Tlie  joint  distribution  of  Vo  and  9 is  given  by 


M»e.9]  = /(Uoie)i(e) 
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2.  Next,  the  posterior  distribution  oi  9 is  obtained,  namely 


;0'’o(yn,e)d0 


where  0 denotes  the  parameter  space  uf  9. 

3.  The  posterior  distribution  oi  9 is  used  to  calculate  the  Bayes  estimate  9ob  of  9. 

4.  Using  Co  and  0os  o (lesi|ii  point  Xi  is  obtained  at  which  sup^gj{d(Co,x)  is 
attained.  Augment  Do  with  X] . Denote  ilie  augmented  design  and  the  data  vector 
observed  using  this  design  by  Di  and  y,  respectively. 

5.  Consider  ^(0|yo)  ^ 't  new  prior  for  9.  The  joint  distribution  of  and  9 is 


Ai(if„e)  = /(if,ie)tA.(e|y„), 


and  the  posterior  distribution  of  9 will  be 


’f'lffllVi)  = 


hi(Vi,9) 

;e'i,(y„0)rfS 


6.  The  corresponding  Bayes  estimates  9\n  is  used  to  find  anotlier  design  point  x-i  at 
which  »npg.^jjri(Cj,®)  is  obtained. 

This  procedure  is  continued  until  < tS  for  a small  positive  S. 


A Bernoulli  response  p,  is  ubsen’ed  at  a point  z,  of  z,  i s 1,. 
regression  prolilein,  the  linear  predictor  rj(z„S)  is  related  to  the  i 
througii  the  equation 


7|(z„e)  = log| 


p[i„9)  1 


For  a logistic 
ofy,,p(zj,9), 
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5.6,1 Four  optimality  cnteria  for  a Hrit-urdef  lopatic  inodd 

For  tlip  one  explanatory  variable  case,  we  consider  the  first  order  linear  predictor 
rj(i.,9)  = 9o  + 9ii, 

The  logistic  regression  model  is  such  that  the  mean  p(x,,  9)  of  y,  can  be  written  as 
p{x„S)  = 1/(1  +exp(-(9o  + 9|ii))). 

There  are  two  parameters  in  9 = (9o,  9i 

Usually,  several  otoervaiions  are  obtained  on  a binary  response  at  a given  i.  Let 
Vi  be  the  number  of  successes  when  n,  observations  are  taken  at  xr  (t  = 1, m).  In 
this  case,  the  log-likelihood  runction  is 

t(»)  = ^ j + SVil“6P(*.,e)  + f)(n,  - v,)log?(z„9) 

~ E V.  IoKP(i.,  9}  + £(n,  - V.)  log?(Zi,  9) 

where  q{x„9)  = \-p{z,.0).  The  information  matrix  is  then  derived  to  be  as  follows, 
(Myers,  Myers,  and  Carter  1994,  p.  4) 


W = E"i  “f  ="iP(tri,S)g(z„0) 


Let  us  define 
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It  can  be  noted  Chat  w’,  t.  2,  < all  depend  on  $.  The  information  matrix  can  be 
simpUHed  In  the  Form 

The  inverse  of  the  information  matrix  is  therefore 

Using  (5.6.2)  and  (5.6.3)  rve  can  obtain  the  forms  of  the  0i  and  ^ optimality 
criteria.  For  we  Imve 

MM.$)  = log(deiM(DA.,e)) 

= log((») 

As  for  ^ suppose  we  are  interested  in  estimating  a function  of  6,  say  k{6).  The 
asymptotic  variance-covariance  matrix  of  the  estimate  uf  k{6)  will  be  (Agresti,  1990, 
p.422) 

= fr|B(e)M(£),v,e)-'| 
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whprp  B{0)  = (^)  (^)  ^ = iMn'  tiilprested  in  estimating  $o 

alone,  then  it(fl)  — So  and  ^ J y Tlien, 


B(0)  = 


(5.6.4) 


Let  UB  dehne . 


*,(M,9)  = -tr(B(ff)M(D»,ff)-') 

= -(r‘+iV) 

rix  Xn*i  with  one  design  point  x^^-i  i.e. 


x««  = d.i„+,) 


Therefore  the  moment  oiatrix  for  this  design  is 


M(i»^„9)  = 


11/*(lw+i,e)  Ul'(lw+|,0)lK+i  \ 


(5.6.5) 


t«'(*N4i.9)  =p(®rrei.9)(’  -P(®«+ipS)) 


If  we  roiisider  M,  = M(D«,fl)  and  M-i  = M(iiv4i,0),  then  from  the  expressions 
(5.3.2)  and  (5.3.3]  wp  iiavp  for  ^ optimality, 

d(D^,I^*l)  = tr(M(D«,9)M(I„+,,0)-')-p 

= ti;'(i«4.|,  9)(|-'  + (iy+,  - i)V»)  - 2 

To  find  a ^i-optimal  design,  Xw4i  is  obtained  as  the  value  of  x at  which  stip^^ffd(Dn,x) 
is  minimized.  Sitniiariy,  for  ^ -optimaiity,  the  Frechet  derivative  is 


d(BK,xs.+,)  = tr[fl(e)M(Da.9)-'M(i^+,,9)M(D«,0)-')  + '»z(M'(D».e).®) 


= V)-(iN*.,.  »)(«<)-’[(»  - - m*\  + ^(M(D«,  9),  0) 

Next  we  ooneider  Q-optimality  for  the  lof^istic  ret^eeeion  model  in  the  oii^variftble 


«id  <'(*)  = (1,1). 

Let  us  consider  a symmetric  experimental  reKioii  aroun-.l  tlie  origin,  e.g.,  R = 

1-a.o! 

, ^ jRC'lx)M(D^,0)-'C(x)dx 


Tlie  Frechet  derivative  for  Q-optimaJity  from  tlie  formula  (5.3.4)  is 
d(D«,j-/v+i) 

= r-' (C(*)Af  (Dw,  0)-’M(xx+„  0)Af{Dw,  0)-'Cl®Hdi  + Q(M(D«,  0),  0) 

= 0)[(sr“(s  - - S)S)=  + - |i^  + gl 

For  Aj-optimality  let  us  suppo.se  tliat  tlie  true  mean  of  the  response  p,  is  given  by 
p'(i„0)  = l/(l+exp(-(0o  + 0|i, + 0ji?))). 

We  assume  that  the  fitted  model  is 


p(i.,0)  = V(l+exp(-(0o  + 0ii.))). 
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111  this  caae.  vi'  = n,p'(ij,9)}'(i„9)  wiiich  deppnds  on  all  thu  three  parameters  On, 
and  62.  To  Hnd  the  expression  for  A^,  we  have 


(5.6.7) 


0,  = 65.  Therefore,  by  (5.2.4), 

L = X',{W-'  - W-'X(X'W-'X)-‘'X'W-')X. 


= Af„  - 

ots  - m^l  - f*  + 12tZ»s»  + 6t»si^ 


where  M„  = D" , w;xl  Mm  = (S;!,  ui,*!?,  S;=i  <3^)  . Mo,  = M',,. 


(5.6.8) 


before  and 


SlmpllFying  the  expression  of  Ag  in  (5.2.5)  we  obtain 


As(A/,9)  = 


/e.'IF’ 


and  the  Frerhet  derivative  for  Aj  optimality  can  be  written 


d{DK,Xs*\) 

^,1 

^ 11^  (,  , - Je.  tr(A(9,)UM„  - MiqM^’Mq,)  - (M’,  - Af'„M„VM',)}lrfF 

.-to*  J^iiF 

M1=U|•{XM^u6){(^XN+,),C.^XNtx))XW‘^XN*,,0)^C{Xn*^)X.{IN*^)' 

Thereforp, 

m™  = (1  - OMJ,  + eK(i«4.)C'(*«4i) 

M„  = (1  - ,)Mi,  + f>/’-(x«+,.9)<(i»4i)f'.(x«+.) 

M„  = (l-4)Ml,  + w-(r«»,.eK.(x»+i)C'(XA-4i) 

M„=(l-e)Ml,+P«-'(3-»^„eK.(x»4.)Cl(x«-H) 

Following  the  ^implication  of  the  expression  of  d{DN,XN*x)  the  same  way  as  on  p. 

187  of  WjjesinliA  Kliuri,  1987,  we  have 

•(Cl(x«+i)  - C'(xA'+i)Mi,"'WJ,)|rfF’-  Aj(M,,e) 

= ^pjg  ®J^“''(xw+i.9)[i»*,  - (a  + is’)*/fs- 


(5.6.9) 


ts‘)!,fdF 


Thf  tfrm  inside  ihe  integral  fr[A(fl,)(M„  - M.oM^’Moi))  h«  the  same  form  of 
Aj(M,0)  in  the  case  oflinear  models.  Now,  A,(M,0)  in  the  linear  model  case  is 


equation  (5.6.10). 


1:2 13) 

where  u,;  = n.Ml<,(?)0-p(^.,e)]. 

The  information  matriN  ran  be  simplihed  as 


= Iob(<1«M(D^,9)) 


j in  estimntins  i(0)  = £>i,  then  a).  1 u„„. 


^(:H) 


<S,(A/,e)  = -tr(B(9)M(D„,e)-') 


vt  + imtx  + 4gti’  - s® 

(t;s(  - m^i  - 

For  estimation  of  Sij  or  S3  w.  can  di^ne  B(9)  accordingly  and  the  expressions  for 
^(M,  9)  can  be  easily  obtained. 

For  01  -optimality,  tlic  expression  for  the  Frecliet  derivative  is  of  the  form 


For  09  -optimality  in  estimating  Si  alone,  the  FVediet  derivative  is 
d(D«,iiv+i) 

= tr(B(9)M(D,v,e)-‘Mliw+i,  9)M{Dv,  9)-']  + 05(M.  9) 

= u>'{i^.^,,9)(i)ls-m®f-s®)-®{(m/(iiv+i  -f)* 

-2m((i/»+i  -l)i+  2.rt(xy+i  -J)’i 

- J)  - - f)  - 2s’{S  - OM)®)  -(-  0s(M,  9) 

Next  let  us  consider  Q-optiitiality  for  the  logistic  regression  model  for  one- variable 
case  with  a second-order  linear  predictor.  The  fonn  of  the  information  matrbe  is  the 
same  as  in  (5.G.U).  In  this  case, 


d(Da',XN*i) 

= tr(M(B«,9)M(iw4,„9)-‘)-p 
= u/'(ia.4.|,9)(trfs-m’(-.s’)’‘(st(i«+i  -i)‘ 
-2ml(j«+i  - if  + w/(iiv+i  — Hf  + Tn3(i«+i  — S)  - m.si 


(5.6.13) 


and  C'(i)  = 

If  we  r-oiisider  a symmetric  expefimenCal  region  centered  at  the  origin,  say  A = 

J^C'U)A^(Da,)~'CWdJ 

= -l/2a(usl  — TT^t  — s^)|2Isn  + 16mn®/3  + 8n^(mfx  + fsx*)  — 6n^e^+ 
2u(oV3  + I6a‘/5(us  - m*  - 2msS  - i$‘i‘  + stE^  + vli^  + 2mlf")| 

The  FVecliet  derivative  for  ^-optimality  is 
d(Dw.ia.4i)  = '->y^K'(*)M(D,y,e)-'M(r«+„e)M(Dw,9)-'C(*)!dz 
-4Q(M(D«,0),9) 

= 0)(u«(  - ij?L  — e*)"’(((i>  -I-  4xm  + 6x’s  + li*) 

-2i«+i(f7i  - 3f»  -e  ti^)  -e  !«*,(«  -h  -t-  aVSIIi«*i(iw+i  - 2i)  -f  s -t-  li*]* 
-l-2a*/3((t;-h4fm-f-6f’s-l- ti*)  - 2if*+i(m  -3is-Mf’)  + iiv+i(s -I- 

- 2i)  + s + tz^  +4aV3l("»-3is  + (i’)  -2iw+,(«-(-Is’)|’ 
+8nV5((m  - 3fs  + fJ^)  - 2ly+,(«  + I2^)]lxl^.,x  + 4aV7i*ia.+ix’)  -h  O(M,0) 

for  an  observation  at  a value  x,  of  tlie  explanatory  variable  z,  a Poisson  response  Vi 
ie  observed.  Here,  the  mean  response  is 

p(zi.fl)  = exp(ft)  + S|Z.) 

for  the  first-order  model,  and 


li{x„  6)  = pxp(9|]  + ffii,  + ^zf) 
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when  the  lineftr  predictor  is  of  tlie  second-order.  Also 
e,;  = Vor(!«)  = /i(i.,e) 


leouential  Deaiciis  for  Logistic  One- Variable  ( 


Let  yuV7.-  -.llt  l>c  Bernoulli  random  variables  with  iiieansp(xi,0),p[z3,6), 
...p(Xn>^)  respectively  where, 


B = (9g,9|)'.  \ow,  let  IIS  assume  0g  and  Bi  have  independent  prior  distributions  given 


li(go>=  ' ■ , Ai<ft)<6» 

(fcj  - 4i ) 

The  joint  distribution  of  6 will  then  have  the  prior 

To  find  a ^optimal  design  we  need  initially  at  least  two  distinct  design  points  to 
start  with.  Suppose  that  £>o  is  such  an  initial  design,  and  let  ]/g  = (pi.P})'  be  the 
data  vei'-tor  obtained  from  this  design.  For  convenience  let  us  denote  p(x,,  0)  by  p, 
from  now  on.  Then  the  likelihood  function  based  nn  vj  is, 

Therefore,  the  joint  distribution  of  the  data  and  the  parameter  is  as  follows, 


oc  f{vi\em 


« Wif-’Wi!-"' 

“ *^(1  +p(fc+*i«.l]»'  |1  +p<*.4.»,.,lj("i-K)(l+c(»>**'«>|’ 

= ‘[I  (l+p(«04«, «,)]"■ 

where  * — The  marginal  Hintribiuioii  of  v5  is  Si«;n  by. 


Mv;)  = f‘  Wa,9)-ie„<‘S, 

ft,  ^ p(e»-reiza)»9 

"It,  Jt,  *^[1  +r(«"+»i*'l]"‘  [1  +f(»«*»iw)j"' 

The  posterior  disiributiou  of  6 has  the  form 


’/b(0ty;) 


MvS.fl) 

u<i(v;) 





rt.  I*, 

it,  Jb, 


de^ds, 


Next,  the  Bayes  estimate  of  6 is  obtained,  which  is  used  to  obtain  a design  point  Zi 
at  which  8upjg^d(Z5oi»)  >*  attained.  We  have, 

«oB  = f‘  p 9oM9\yl)dBtde, 


= P P e^Me\yi)deode, 


Augmciii  Do  with  Xi.  Let  the  augineiiteri  design  and  rotre»)ionding  data  be  denoted 
by  D]  and  Now,  the  joint  distribution  of  9 ami  yj  {considering  ift>(9|y;)  to  be 
the  new  prior),  will  be 


^(^+#1*1)2:  (,{eo4ei£})n  ^(ea+eirjiei 

[1  + e(«o+*ixi)]  (]  + fteo+eml]  (i  + e(«o+<;H)| 


which  will  result  in  the  new  marpnal  distribution  of  y| 

t»i(y;)  = _/^£’h.(y:,0)dMe. 


Finally  the  posterior  distribution  of  B will  imvr  tlie.  form 


fti(y;,9) 

>'i(yi) 


After  Kiiding  the  Bayes  estimates  of  6q  aud  6i  at  tills  stage  using  Vti(^lv')>  one  more 
design  point  is  obtained  where  suPjjjjd(Di,i)  is  attained.  Let  the  augmented  design 
be  D2  and  the  data  observed  for  tlial  design  is  respectively.  Thus, 


[l+«t*»rei'|l]  [l+tlht*i'll]  (l+cl*0-»i«O]  " ' 


and  the  posterior  distribution  of  B will  be 


^ ^ [l4s<^*ein  ']"|i*^ira-ei  ^ 


which  will  be  used  as  a prior  to  0 In  the  next  step.  It  can  be  easily  shown  that  at  the 
mth  step,  the  posterior  distribution  of  0 will  take  the  form 


The  iterative  procedure  is  continued  until 


sup  < (S 

xeR 


where  j is  a small  positive  number 
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For  the  second-order  logistic  regression  model,  if  similarly  we  follow  the  above 
procedure,  6 = after  the  mth  stage  the  posterior  distribution  of  & s 

(^0,  ^1,^)'  will  take  the  form 


St:  Sti  A[9)de,dB,H9^ 


In  this  example,  ip,  and  Q-oplimal  designs  for  logistic  model  with  one  variable 
were  determined  using  both  sequential  and  Bayes-sequential  method-s.  First,  let  us 
consider  ^i-optimal  designs  using  Frequentist  sequential  method.  For  such  method, 
initial  values  ofS^and  0|  are  needed.  Initially,  the  means  of  the  uniform  distributions 
provided  by  Chaioner  and  Larntz  (1989)  were  used.  Starting  values  of  fig  and  9i  were 
0 and  7 respectively.  Initial  designs  were  also  assumed  within  Che  rt^on  (-1,1)  for 
both  optimality  cKteria.  Tlie  convergence  criterion  used  was  set  at  6 = 0.008.  We 
start  with  Che  initial  design  (,S,  —.5)'. 
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TABLE  5.1 


Steps  for  (ir  optima]  desigu  usiUK  sequential  method 


Initial  Bmgii 

Initial  Values  ol 

S'iP,6Rrf(On,l) 

«i(M,  0) 

0.5 

0 

0 

2.39 

-7.12 

-0.5 

1 

0.33 

-2.43 

1 

0 7 

0 

153.33 

-12.62 

•1 

-1 

-1 .26 

-1-61 

-0.1 

0 

0.33 

0.14 

-5.78 

0.1 

0 

-O.-IS 

-0.22 

-4.33 

-0.09 

0 7 

0.33 

0.55 

-5.96 

0.09 

0 

-0.46 

-0.09 

-4.36 

-0.08 

0 7 

0.33 

1.14 

-6.16 

0 

-0.48 

0.09 

-4.37 

0.085 

0 7 

0.33 

0.82 

-6.06 

-0.085 

0 

-0.47 

-0.002 

-4.36 

TABLE  5.2 


Effect  of  initial  vaiues  of  tbe  parameters  op  4i,-  optimal  desiirn 
using  sequentiaJ  method 


Initial  Design 

tuiiial  Values  of 

2n4-l 

MM.6) 

So  S, 

0.085 

0 10 

-0.23 

-0.43 

-6.20 

-0.085 

-1 

15.36 

-4.47 

0 

1 

0.58 

-1.74 

0.085 

0 1 

-1 

52.79 

-5.91 

-0.085 

0 

1 

0.55 

-1.88 

0.035 

0 4 

-0.59 

5.92 

-5.96 

•0.085 

-1 

3-47 

-2.85 

0.085 

0 -5 

-0.47 

3-19 

-5.98 

-0.085 

1 

5.71 

-3.26 

0 

•I 

-0.65 

-1.27 

120 


TABLE  5.2  coutd. 


Initial  Values  uF 

InUial  Deaigii 

ft,  S, 

^.41 

6i(M,  0) 

0.085 

0.3  7 

-0.35 

1.57 

-6-10 

-D.0B5 

1 

9.47 

-3.77 

0 

-1 

-0.39 

-1.42 

0.085 

0-9  7 

-0.39 

4.37 

-6.41 

-0.085 

0.96 

7.34 

-3.58 

0 

-0.96 

-0.52 

-1.43 

0.085 

1.5  7 

-0.« 

10.78 

-7.01 

-0.085 

0 

-0.94 

-0.30 

-3-47 

0.085 

2.5  7 

-0.53 

45.91 

-8.45 

-0.0R5 

-0.75 

-0.84 

-3.43 
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Fbr  ^1  optimality  (Table  5.1),  liie  initial  design  is  (.5, -.5)'.  The  nuiiimum  value 
of  sup^^^d(0„,2)  is  attained  at  0.33  after  adding  two  design  points.  Any  further 
addition  of  design  points  does  not  improve  tlie  convergence  criterion.  So  we  start 
with  another  initial  design  (1,-1)'.  Here  again,  minimum  sup^^jj d(D„,  a)  dors  not 
meet  the  convergence  criterion.  So  this  proretlnre  is  followed  by  trial  and  error  until 
we  Rnd  the  proper  initial  design  to  meet  the  convergence  criterion  and  finally  obtain 
the  optimal  rie.sign  whirh  Is 


Table  5.1  reveals  that  the  choice  of  the  initial  design  is  very  crucial  in  obtaining 
optimal  designs. 

Next,  the  Initial  values  of  $[,  and  6\  vrere  changed,  keeping  the  initial  design  the 
same  (T^le  5.2).  We  notice  tliat,  a different  set  of  initial  estimates  of  ^ and  5, 
produre  near-optimal  designs.  We  have  to  start  with  a different  initial  design  again 
to  obtain  a unique  optimal  design. 

Similar  conclusions  can  be  drawn  wliile  finding  a Q-oplima!  design.  Tables  5.3- 
5.4  display  the  procedures  to  obtain  optima!  designs  for  different  initial  designs  and 
dlflerent  initial  values  of  % and  £|. 
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TABLE  5.3 

Steps  for  Q-  optima]  desigu  osius  sequential  method 


Initial  Valuta  of 

Initial  Design 

8b  9, 

ih+i 

sup^gj^rf(On,ar) 

QiM.B] 

0.1 

0 7 

0.34 

75.50 

-76.51 

-0.1 

■1 

121.85 

-14.61 

0 

1 

0.19 

-2.51 

-0.5 

0 7 

0.33 

-11.08 

-26.69 

0.5 

0.94 

-0.75 

-4.82 

0 

-1 

-0.03 

-2.52 
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TABLE  5.4 

Effect  of  initial  values  of  tlio  parameters  oil  0-  optimal  design 
using  sequeutial  method 


Initial  Values  ut 

Initial  Design 

00  «1 

sup,,Rrf(D„,x) 

Q{M,e) 

-0.5 

0 10 

-0.24 

29.54 

-104-08 

0 

1 

-0.78 

-4.69 

0.5 

0 I 

-1 

0.90 

-4.22 

-0-5 

0 

-0.64 

-2.83 

0.5 

0 -5 

-0.44 

-6.76 

-12.07 

-0.5 

1 

1.10 

-3.24 

0 

-1 

-0.74 

-1.68 
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TABLE  5.4  coutd. 


Inltiel  Values  oF 

Iiiiii&l  Design 

So  S| 

Xn+I 

Q(M.6) 

Q.S 

2 7 

0.01 

-11.05 

-25.88 

0 

1 

-0.72 

-5.72 

0.5 

-3  7 

0.10 

-6,51 

-50.6, 

-0.5 

1 

-0.51 

-5.87 

The  Q-npilmal  desij'ii,  assuming  the  initial  values  of  Su  and  S|  to  be  0 and  7 
respectivriy,  is  given  by 

/ .5  \ 

-.5 

0 

.33 

.94 

, -1  , 

In  the  second  part  of  this  problem,  optimal  designs  using  Hayes  sequential  methods 
were  developed.  In  most  practical  situations,  one  can  collect  data  at  the  initial  design 
settings.  In  this  example,  ob.servations  were  generated  from  a binomial  distribution  for 
a logistic  regression  model.  The  values  of  the  success  probability  and  bence  the  values 
of  So  and  S|  are  needed  to  generate  binomial  r.v.'s.  So,  for  these  initial  setting,  values 
of  So  ettd  Si  were  randomly  geoerated  from  the  uniform  distributions  specided  within 
brackets  in  Tables  5.5-5.S.  Let  us  aasume  8q  is  uniform(-.l,.l)  and  Si  is  uniform(d,S) 
(Clialoner  and  Larntz,  1969).  We  start  with  the  initial  design  But  for  this 

combination  of  Initial  parameter  values  and  initial  design,  the  convergence  criterion 
is  not  satisfied  since  minimum  value  of  sup^gjjd(D„,i)  is  .21.  So  this  procedure  is 
followed  by  trial  and  error  until  we  find  the  optimal  design. 
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TABLE  5.5 

Steps  for  g|-  optimal  design  using  Baycs-gequential  method 


Initial  Design 

Initial  Values  of 

e.  9, 

-0.1 

l-.i,.i) 

(6,8) 

0.34 

0-21 

^.23 

0.1 

-0.01 

G.61 

-0.2 

(-.1,-1) 

(6,8) 

0.25 

-0.98 

-5.50 

0.2 

-0.01 

G.Gl 

-0.5 

(-1.1) 

(6,8) 

0 

2.09 

-6.97 

0.5 

-0.01 

G.01 

0.23 

-1.07 

-5.27 

-0.1 

(-1..1) 

(6.8) 

0.33 

0.16 

-5.78 

O.I 

0 

-0.01 

6,61 

-0.11 

(-1,-1) 

(6,8) 

0,33 

-0.15 

-5.63 

0.11 

•0.01 

G.G1 

-0.105 

(-1..1) 

(0,8) 

0.33 

-0.006 

-5.70 

0.105 

0 

-0.01 

6.61 

TABLE  5.8 


Effect  of  initial  values  of  the  parameters  on  0i-  optimal  design 
uaing  Bayes-sequentinl  method 


Initial  VaiufS  uf 

Initial  Design 

B, 

lu+l 

0i(M,  0) 

0.105 

0 

(-1..1)  (4,10) 

^1.01  0.78 

0.34 

0.09 

-5.69 

0.105 

0 

(-1..1)  (2,12) 
-0.0!  6.42 
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TABLE  5.6  coutd. 


Initial  Design 

Iiiilifil  Values  of 

MM,  9) 

$0  S, 

-0.105 

(-1,1) 

(6,8) 

0.34 

0.26 

-5.72 

0.105 

0 

-0.33 

6.61 

-0.105 

(-2-5,2.5) 

(6.8) 

0.46 

9.11 

-6.89 

0 

-0.82 

6.61 

0.49 

-1.06 

-4.38 

-0.105 

(-5,5) 

(6,8) 

0.62 

94.51 

-9.44 

0.105 

0 

-1 .64 

6.61 

0.76 

-0.86 

-5.15 
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TABLE  5.7 

Steps  for  Q-  optimal  desigu  using  Bayes-  sequential  metliod 


Iiiitiai  Values  of 

Initial  Design 

Oo 

^1 

QiM.e] 

0.5 

(-1.1) 

(6,8) 

0.34 

-10.55 

-23.57 

-0.5 

0 

-0.04 

6.67 

-0.29 

-6.67 

-15.53 

0.7 

-0.7 

0 

(-1.-1) 

-0.04 

(6,8) 

0.36 

0.8 

(-1.-1) 

(6,8) 

0.36 

23.45 

-53.27 

-0.8 

0 

-0.04 

6.67 

-0.30 

0.2S 

-24.17 

0.79 

(-1.1) 

(6,8) 

0.36 

19.84 

-51.51 

-0.79 

0 

-0.04 

6.67 

-0..30 

-0.15 

-23.82 

0.795 

(-1.-1) 

(6,10) 

0.36 

21.60 

-52.38 

•0.795 

0 

-0.04 

6.67 

-0.30 

0.06 

-24 

0.7935 

(-1.1) 

(6.10) 

0.36 

21.07 

-52,12 

-0.7935 

0 

-0.04 

6.67 

-0.30 

-0.003 

-23.94 

TABLE  5.8 


Effect  of  iuieial  values  of  the  parameters  on  Q-  optimai  design 
luiue  Bayes-  scgueutial  method 


Initial  Values  of 

Iniliftl  Design 

So  S, 

Q{M,e) 

0.7935 

(4,10) 

•0.47 

-6.02 

-14.52 

•0.7935 

U 

-0.CM 

6.01 

0.35 

-3.71 

-13.54 

0.7935 

1-1-1) 

(2,12) 

-0.68 

-3.23 

-5.20 

-0.7935 

0 

-0.04 

5.36 

0.64 

-2.02 

-3.69 
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TABLE  5.8  contd. 


Initial  Valut>s  oI 

Initial  D(«igii 

e, 

In+l 

<?(M,9) 

0.7935 

(-1.1) 

(6,10) 

0.41 

31.77 

-45.78 

-0.7935 

-0.39 

6.67 

-0.20 

-6.55 

-20.14 

0.7935 

(-2.5,2.5) 

(6,10) 

0.58 

-10.02 

-27.60 

-0.7935 

-0.98 

6.67 

Q.09 

-4.10 

-23.25 
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The  (ii-optimal  design  using  Bsyes-sequeiitinI  design  is 

(.105 
-.105 

while  the  Q-optimel  design  using  Bayes-sequential  method  is  found  to  be 

' .7935  ' 

-.7935 

0 

.36 

To  compare  the  four  optimal  designs,  particularly  the  sequential  vs.  Bayes- 
sequemial  designs,  we  define  the  following  efficiency  measures.  For  two  designs  Di 
and  0], 

t^j-efficiency  of  D\  with  respect  to  Dg  is 

Similarly,  Q-effieieney  is  defined  a-s 

_ <?(Af[Oi),e) 

Q{M{D,].9) 

Now,  since  for  both  sequential  and  Bayes-sequential  cases  the  optinial  design  max- 
imises either  or  Q-ctiteria,  and  since  and  Q{M,8)  are  negative  (t  and 

s both  being  frarxions,  * (M,  01  = fo9(!s)  will  proriut*  negative  values  as  described 
on  pg.  104),  we  will  consider  D,  to  be  more  efficient  than  Dj  if  the  efficiency  measure 
is  less  than  1 , 

Using  the  optimal  designs,  we  fit  the  lugjstir  regression  model  for  une  variable  case 
to  get  the  estimate  of  A/(.).  Then  (.))  and  13(Af  (.))  are  calculated  to  compare 
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thp  clesignE.  Following  Are  thf  results  obtained  nFler  fitting  all  the  four  models:- 
For  sequential  ijietliod,  0i(M,6)  = -2.018. 

For  sequential  method,  Q(M,  0)  = -0.C207. 

For  Eayes-sequential  method,  ^ (Af,  0)  = —2.102. 

For  BayeMequential  method.  Q{M,S)  = -.4420. 

The  di-effidency  of  sequential  with  respect  to  Bayes-sequential  is -962  and  Q-efficiency 
of  the  sequential  rs.  Bayes-sequential  is  1.402.  So,  iu  cuiiclusiou  we  can  say  that  for 
$i-optiniallty  criterion,  either  method  is  e<]Uivalently  optimal  while  for  Q-crIterlon, 
Bayes-sequential  Is  more  eflicieut  than  the  ordinary  sequentiai.  With  respect  to  the 
number  of  design  polnis,  Bayes-sequential  needs  fewer  points  than  the  sequential 
method.  In  summary,  we  can  conclude  that  Bayes-sequential  Is  more  cost-effective 
and  efhcieiit  than  the  ordinary  sequential  procedure. 


CHAPTER  G 


CONCLUDING  REMARKS  AND  FUTURE  RESEARCH  TOPICS 

In  the  teaching  of  linejir  models  there  seems  to  be  an  underlying  hierareliy  that 
is  followed.  The  first  linear  model  most  students  and/or  researchers  are  exposed  to 
is  one  in  which  the  standard  assumptions,  that  the  emus  are  normally  dislrihuted 
with  equal  variances,  are  made.  Parameter  estimation  and  prediction  of  response 
values  are  performed  using  least  squares  estimation.  Also  exact  F-tesLs  are  derived 
to  determine  tack  of  fit  of  the  assumed  model. 

The  next  step  up  this  luerarrhy  brings  us  to  normal  errors  with  unequal  variances. 
Weighted  least  squares  is  used  for  estintating  parameters  and  predicting  responses. 
But  for  tfstting  lack  of  fit,  due  to  randoiniiess  of  the  estimated  variance-covariance  ma- 
trix, it  is  not  possible  to  construct  an  exact  test.  In  this  dissertation,  an  approximate 
x’-teai  for  a particular  hypothesis  of  detecting  lack  of  fit  is  developed.  A drawback 
of  our  approximation,  however,  Ls  that  the  quantiles  below  the  median  are  nut  well 
approximated,  but  the  x’-approximatlon  is  very  Hose  for  all  the  quantiles  above  the 
median.  This  has  yet  to  be  resolved.  One  possible  soiutiuii  is  to  try  a vreighted 
choosing  the  weiglits  properly  to  approximate  the  rii.siribution  below  the  median  as 

The  assumption  of  independent  observations  witli  different  variances  was  made. 
Another  way  of  looking  at  this  problem  Is  to  assume  the  usual  unknown  variancc- 
coBvariance  matrix  hut  with  dependent  responses  forcing  the  nondisgnnal  elements  of 
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thf  v&rlaiire-covariaiicf*  matrix  to  bp  iioiizpro  iiiEtead  of  a diagonal  variancp-covariatica 
matrix.  In  tbis  case,  the  situation  baronies  more  complicated.  However,  if  equal 
mimbeni  of  replications  are  available  at  all  the  design  points,  then  the  results  of 
ZeJIner  (1962)  can  be  of  use  fur  extruding  the  proposed  approximate  test-statistic. 

The  concept  of  generalized  linear  models  (GLMs)  is  u.siially  not  introduced  in 
response  surface  methodology.  Herr  again  estimation  of  parameters  and  prediction 
of  rrsponsea  can  be  performed  using  weighted  least-squares  procedures  according 
to  standard  GLM  estimation  terhniques.  The  concept  of  modifying  the  standard 
GLM  estimation  method  to  reduce  tlie  prediction  variances  is  introduced  iti  this 
work.  Instead  of  a first-order  Taylor  series  expansion,  we  have  considered  estimation 
using  a second-order  approxiriiaiion.  Iii  tlie  devrlopirieiit  of  our  procedures,  we  made 
comparisons  among  the  prediction  variances  of  both  modified  and  standard  GLM  for 
logit  and  log  links.  Also,  it  would  be  of  interest  to  develop  the  same  procedure  for 
otlier  link  functions  such  as  probit,  log-log  and  complementary  log-log. 

Khuri  and  Myers  (1979)  proposed  the  modified  ridge  analysis  procedure  for  the 
optimiaation  of  second-order  responses  under  the  usual  assumption  of  uormal  errors 
with  equal  variances.  The  same  approach  was  used  in  the  next  step  of  optimization 
where  the  responses  are  assumed  Co  be  independently  normally  distributed  but  in 
this  case,  with  unequal  variances.  As  in  the  lack  of  fit  tests,  an  estimate  of  the 
variaiice-fovariancc  matrix  can  be  calculated.  So,  the  ridge  analysis  resmlts  now 
depend  on  the  randomness  of  this  estimate.  Turning  to  more  general  models,  we 
developed  a modified  version  uf  the  modified  ridge  analysis  under  GLM’s.  To  avoid 
the  randomness  of  the  estimated  £ matrix,  instead  of  using  individual  point  estimatf*. 
confidence  regions  were  constructed.  In  all  tlie  problems  discussed  above,  we  have 
assumed  iiidepeiident  observations.  A (lo-wible  exteusioii  is  to  relax  this  assumption 
and  work  witli  correlated  observations-  We  have  also  implemented  modified  GLM  in 
the  case  of  optimizing  a second-order  rrsponse  using  the  new  modified  ridge  analysis. 
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The  question  of  constructina  optimal  desi|ns  for  generalised  linear  models  seems 
Co  be  a difficult  one.  The  problem  is  the  design's  dependency  on  the  unknown  pa- 
rameter values.  This  could  only  be  solved  by  one  of  tbe  following  two  approaches: 
the  parameters  can  be  estimated  sequentially  until  a certain  optimality  criterion  is 
achieved  or  we  could  assign  prior  distribuciotts  to  the  parameters  and  perform  a 
Bayesian  analysis.  Some  work  has  been  done  in  logistic  regression  modeb  using  both 
parametric  and  Bayesian  approaches.  In  the  statbtics  literature,  values  of  the  un- 
known parameters  arc  assumed  known  in  order  to  obtain  optimal  designs.  Also  in 
case  of  Bayesian  analyses,  the  number  of  design  points  are  flsed  beforehand. 

In  this  thesb,  sequential  procedures  were  used  for  both  parametric  as  well  as 
Bayesian  analyses.  Initial  values  of  the  unknown  parameter  had  Co  be  speciHed.  The 
sequential  procedure  was  followed  until  a specific  optimality  criterion  was  satbfied. 
Hence,  the  number  of  design  points  does  not  have  to  be  fixed  before  Che  experiment, 
it  b determined  along  with  the  sequential  procedure.  Also,  four  different  optimality 
criteria  were  provided  for  a unified  general  link  function  and  a general  response. 
The  derivations  of  these  procedures  become  highly  complicated  beyond  second-order 
responses.  This  arsa  could  be  explored  in  the  future  for  more  efficient  or  approximated 
methods. 

In  the  Bayes-sequential  procedure,  a squared-error  loss  function  is  assumed.  Other 
types  of  different  loss  functions  rould  be  explored  to  find  optimal  designs.  Also,  we 
have  used  uniform  distributions  as  the  priors  for  the  unknown  parameters.  The  effect 
of  choice  of  difierent  priors  on  optimal  designs  could  be  checked  as  well. 


APPENDIX 


DifTprpiit  lypes  of  rompuier  iirograras  wpr?  us«l  for  ihe  compiitaiioiial  parts  of  ths 
dissertation  for  diffeteiiL  cliapters.  In  oliapcer  one,  tiie  convergence  of  tbe  parafneter 
estimates  were  formulated  using  S-plU9  functions.  Chapter  two  needed  a program  for 
50,000  simulations  of  the  tail  probabilities  of  t whidi  was  calculated  using  Fortran 
programs.  Subroutines  DGEMM,  DGEDI  and  DGEFA  were  taken  from  the  public 
donuiin  library  tu  use  in  the  main  simulation  program  for  matrix  multiplications 
and  Inverse  calculations.  Optimization  of  the  response  variable  using  extension  of 
the  modified  ridge  analysis  was  developed  in  chapter  three.  S-plus  functions  were 
created  for  this  calculation.  Finally,  for  the  optimal  designs,  Fortran  programs  were 
generated.  Subroutines  GAUS,  GAUSQUAD  and  OPTCRS,  OPTFNC,  OPTBND 
were  used  for  multiple  numerical  integration  and  function  minimisation. 
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